Why does the SARS-Cov2 genome end in aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa? (2020) (bioinformatics.stackexchange.com)
Good observation! The 3' poly(A) tail is actually a very common feature of positive-strand RNA viruses, including coronaviruses and picornaviruses.
For coronaviruses in particular, we know that the poly(A) tail is required for replication, functioning in conjunction with the 3' untranslated region (UTR) as a cis-acting signal for negative strand synthesis and attachment to the ribosome during translation. Mutants lacking the poly(A) tail are severely compromised in replication.
Here I have a screenshot of the ApE editor displaying one plasmid I made for a neurobio experiment involving the overexpression of two chimeric proteins (actin and profilin, respectively linked to green and red fluorophores; note the editor has autotagged the polyA tail feature):
There have been a lot of analogies with NOP sleds in the comments here and there, but if you look at how the process of reading the genome works, this section is more like the leader/trailer on a tape:
https://en.wiktionary.org/wiki/leader#Noun "A piece of material at the beginning or end of a reel or roll to allow the material to be threaded or fed onto something, as a reel of film onto a projector or a roll of paper onto a rotary printing press."
https://en.wiktionary.org/wiki/trailer#Noun "A short blank segment of film at the end of a reel, for convenient insertion of the film in a projector."
So props to them for going a bit more into detail here - and also highlighting that the reason for the fuzzy phrases can often be that we literally don't know the details: The empirical basis may be "if this thing is removed then this other thing won't work", without us necessarily knowing why this is the case.
In programming, we designed our systems to give us what seem like hard bottoms in our formal models. Most programmers don't reason below the level of their structured programming language. Of the ones that do, most treat the processor instructions as a hard bottom. There are layers down and down until you have physicists working on semiconductor properties, but we have intentionally designed the layers so that you can comfortably rest on them.
In biology any formal structure you think in is logically poised over the abyss. What pins it in place is not that it is on philosophical bedrock, but the observations and experiments that the formal structure summarizes.
If the virus could be bound with an artificial RNA strand that had a stronger bond than natural RNA, it could be denatured, and pooped out.
https://faseb.onlinelibrary.wiley.com/doi/10.1096/fasebj.31....
[0] https://berthub.eu/articles/posts/reverse-engineering-source...
I didn't realise there was so much crossover between embedded design and biology!
Just kidding...sort of!
So... to me, it's odd they invoke "legitimate code." The comparison I'd consider would be "combative code." For example, the old game "Core War." Thinking in that mindset, I can see several uses for a "nop sled" in "legitimate code."
I also like how it is established that this has an effect on replication, but that as far as I understand we do not understand the underlying process. Humbling.
Are there probably desirable chemical properties? Yes. Is nature overloading each part of a genome with uses? More than likely. Has it figured out how to terminate a sequence? Obviously.
Honestly it's fuckin wild, there's a lot going on rather than just linear read->express
https://upload.wikimedia.org/wikipedia/commons/f/f4/Coronavi...
There might be a software analog to another polyA tail feature: the provision of a 'shelf-life'. Each replication cycle removes a few adenosines, and at a certain point the tail sequence is too short to recruit protection and the RNA is ushered into the degradation pathway.
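To make the 'shelf-life' idea concrete, here is a toy sketch of a countdown that expires once the tail gets too short. Everything in it is an assumption for illustration: the function name, the protection threshold of 10 adenosines, and the loss of 1 to 3 adenosines per cycle are made-up numbers, not measured values.

```python
import random


def cycles_until_degraded(tail_length, min_protected=10, max_loss=3, seed=0):
    """Count cycles until the tail is too short to stay protected.

    Toy model only: min_protected and max_loss are hypothetical values,
    not biological measurements.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same answer
    cycles = 0
    while tail_length >= min_protected:
        # Each cycle trims between 1 and max_loss adenosines off the tail.
        tail_length -= rng.randint(1, max_loss)
        cycles += 1
    return cycles


if __name__ == "__main__":
    # A 33-nucleotide tail, like the one in the question's title.
    print(cycles_until_degraded(33))
```

The tail behaves like a TTL field: nothing stores an expiry date explicitly, the remaining length is the counter.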
In genomic assays the poly(A) tail does not come out at one specific length, but a single consensus sequence is still reported.
This was also posted in the first comment:
> Similar to eukaryotic mRNA, the positive-strand coronavirus genome of ~30 kilobases is 5’-capped and 3’-polyadenylated. It has been demonstrated that the length of the coronaviral poly(A) tail is not static but regulated during infection; however, little is known regarding the factors involved in coronaviral polyadenylation and its regulation. Here, we show that during infection, the level of coronavirus poly(A) tail lengthening depends on the initial length upon infection and that the minimum length to initiate lengthening may lie between 5 and 9 nucleotides. By mutagenesis analysis, it was found that (i) the hexamer AGUAAA and poly(A) tail are two important elements responsible for synthesis of the coronavirus poly(A) tail and may function in concert to accomplish polyadenylation and (ii) the function of the hexamer AGUAAA in coronaviral polyadenylation is position dependent. Based on these findings, we propose a process for how the coronaviral poly(A) tail is synthesized and undergoes variation. Our results provide the first genetic evidence to gain insight into coronaviral polyadenylation.
Peng Y-H, Lin C-H, Lin C-N, Lo C-Y, Tsai T-L, Wu H-Y (2016) Characterization of the Role of Hexamer AGUAAA and Poly(A) Tail in Coronavirus Polyadenylation. PLoS ONE 11(10): e0165077
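If you want to look at these two elements in a sequence yourself, a minimal sketch might look like the following. It measures the run of A's at the 3' end and reports how far upstream of that tail the last AGUAAA hexamer sits. The function names and the example sequence are made up for illustration; the only thing taken from the abstract is the hexamer itself. Database records usually write the genome in DNA letters, so the sketch converts T to U before searching.

```python
def poly_a_tail_length(seq):
    """Return the length of the run of A's at the 3' end of seq."""
    seq = seq.upper()
    return len(seq) - len(seq.rstrip("A"))


def hexamer_gap_to_tail(seq, hexamer="AGUAAA"):
    """Return the number of nucleotides between the last hexamer and the tail.

    Returns None if the hexamer is absent, and 0 if the hexamer runs
    directly into (or overlaps) the trailing A's.
    """
    seq = seq.upper().replace("T", "U")  # accept DNA-style records
    tail_start = len(seq) - poly_a_tail_length(seq)
    pos = seq.rfind(hexamer)
    if pos == -1:
        return None
    return max(0, tail_start - (pos + len(hexamer)))


if __name__ == "__main__":
    example = "GGAGTAAACC" + "A" * 9  # hypothetical sequence, not real data
    print(poly_a_tail_length(example))
    print(hexamer_gap_to_tail(example))
```

Because AGUAAA itself ends in A's, a hexamer that abuts the tail is indistinguishable from a slightly longer tail, which is why the gap is clamped at 0 instead of going negative.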
Now, making a drug that targets only viruses and not your body's own RNA? Possible, but it is so hard that not much progress has been made.
Most of genetics is like that.
Yes, that is an understatement.