| This question has a simple answer and a complicated one. The simple calculation starts like this: the human genome has about 3 billion base pairs, each of which can be one of four bases (A, C, G, T). Biology therefore uses base 4, not base 2, so each base pair carries two bits. The result is around the 6 billion bit mark, or roughly 750 megabytes, a figure directly comparable to the space on a hard drive.
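The arithmetic behind the simple answer can be sketched in a few lines of Python. This is only an illustration of the naive upper bound: the function name `raw_capacity_bits` is made up for this example, and the 3 billion figure is the rounded one used above.

```python
import math

BASE_PAIRS = 3_000_000_000  # rounded genome length used in the text
ALPHABET_SIZE = 4           # A, C, G, T


def raw_capacity_bits(n_positions: int, n_symbols: int) -> float:
    """Naive capacity: each position independently holds log2(n_symbols) bits."""
    return n_positions * math.log2(n_symbols)


bits = raw_capacity_bits(BASE_PAIRS, ALPHABET_SIZE)
megabytes = bits / 8 / 1_000_000  # 8 bits per byte, decimal megabytes

print(f"{bits:.0f} bits, about {megabytes:.0f} MB")
```

This treats every base as independent and equally likely, so it gives the storage space the sequence would occupy, not the amount of meaningful information in it.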
However, this is where the analogy with computers breaks down. The genome's 'bits' can be further modified: the bases themselves by methylation, and the histone proteins that package the DNA by acetylation. Considering only methyl groups, and supposing that any base may or may not carry one, each base can take on eight, not four, possible forms. That doubles the number of states, which would add one bit per base (three bits instead of two), but it doesn't work like that. Such meta-data does not store extra sequence; it is used for various other purposes, like chromatin assembly. Simply put, this is a little like data archiving or compression, although the reality is much more complex.
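To see what the extra forms would amount to if they were counted as plain storage, here is a rough sketch. It assumes, purely for illustration, that every base could independently be methylated or not, which, as noted above, is not how the cell actually uses these marks; `bits_per_base` is a hypothetical helper name.

```python
import math


def bits_per_base(n_states: int) -> float:
    """Bits needed to distinguish n_states equally likely forms of one base."""
    return math.log2(n_states)


plain = bits_per_base(4)       # A, C, G, T
methylated = bits_per_base(8)  # each of the four, with or without a methyl group

print(f"unmodified: {plain:.0f} bits per base")
print(f"with methylation: {methylated:.0f} bits per base")
print(f"increase: {methylated / plain:.1f}x")
```

Doubling the number of forms adds one bit per base, so even this naive count grows by half rather than doubling.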
Indeed, how much information any eukaryotic genome contains is a difficult question. On the analogous hard drive, how much of the data is information? Some bits represent files, others are empty, but there are also files that have been 'deleted' and partially written over. Are these remnants pieces of information, or junk that litters the drive? Oddly, the same situation is found in the genome: repetitive elements, old viruses, 'deleted' genes and so on. Nature is much more parsimonious than we are and finds a use for everything, even so-called junk.
What can you do with junk? Again, the computer analogy is inadequate to illustrate this. DNA might seem like a digital medium, but it does have a definite physical basis. A particular sequence of bases has a definite shape, which can be subtly different from that of another sequence. This, in turn, might affect how the molecule coils in the local region and perhaps in neighbouring ones. More importantly, noncoding bases can be methylated, which allows them to alter chromatin structure. |