The relation between the entropy of Boltzmann (in thermodynamics) and that of Shannon information (in communication) has recently been discussed by Vopson (e.g., Refs [1,2]). Amongst several interesting similarities and differences pointed out by Vopson, one is that writing information to a computational device increases entropy. This assertion appears counterintuitive because entropy is often described as negative information. Nonetheless, it appears much more natural when expressed in terms of a definition of information proposed by the present author (BR). There are, of course, other kinds of information, e.g., Fano mutual information and Chaitin information [3]. These lie closer to the author's approach discussed below, but there are fundamental differences. Some have been touched upon by others (e.g., ), but these are relatively recent and lack aspects important to the present discussion.

The original publications [7][8][9] were the consequence of a student project given to the present author, namely, to assess what information about the three-dimensional structures of proteins was conveyed in their amino acid sequences when data were extremely limited. As data became more plentiful, a Theory of Expected Information emerged that could also be applied more generally [10]. As a predictive method, it resembles, but predates, the Bayes Net [11]. The relevance to the relationship between information and entropy is clearer in later applications that included the analysis of biomedical data given as large spreadsheets [12][13][14][15]. One wished to address the information required to write and (primarily) extract information as knowledge from a spreadsheet in which, e.g., each record (row) is a patient. The important thing is that the individual data elements are attributes of the form 'attribute type':='attribute value', e.g., 'systolic Blood pressure (mmHg)':=140.

There are 8 main points; point 6 is important to the write-read issue.

(1) The number of attribute types, each called say A, B, C, ..., and hence the number of columns in a spreadsheet, is the so-called "explicit dimensionality", say N.

(2) Each unique attribute, an individual data element such as 'systolic Blood pressure (mmHg)':=140, may be encoded by a distinct prime number 2, 3, 5, 7, ..., the product of which, i.e., a composite number, encodes all the information in each row (record), analysed in turn. Note that attributes in a record can occur more than once with the same number code. This leads to many insights and useful algorithms, not least to relate the method of counting