Sequencing your DNA with a USB dongle and open source code (stackoverflow.blog)
There's a guy on YouTube doing DIY gene therapy to treat his lactose intolerance, so it's not exactly science fiction.
https://www.latimes.com/california/story/2020-12-08/man-in-t...
I'm really hoping someone will work on an open source "23andme@home" solution that ties all this together in an accessible way.
https://store.nanoporetech.com/us/minion.html
https://www.extremetech.com/extreme/190409-minion-usb-stick-...
On some of their larger devices (eg, the PromethION), they've moved outright to a "we lend you the device for free, you buy the consumables" model.
https://en.wikipedia.org/wiki/Nanopore#Inorganic https://nanoporetech.com/how-it-works/types-of-nanopores
To actually sequence DNA with this USB thingy you need to prepare a so-called sequencing library, and for that you need a fairly well-equipped lab, expensive reagents, and years of practice and skill ... a mid-level biology Ph.D. can prepare these ...
In addition, the flowcell sold by Oxford Nanopore often malfunctions and the whole run is a bust ... (it has behaved like this since 2014 ... so no, the technology does not seem to be improving a whole lot)
On one hand, I would love to learn something new about my body.
On the other hand, what if the results tell me that I am predisposed to some horrible untreatable disease? Will I spend the rest of my days observing every little pain or discomfort and thinking "is this IT?"
Now I'm a Data Engineer doing backend work in public sector. :)
Here are some press releases related to articles I published during my PhD:
https://physics.illinois.edu/news/article/34064
https://www.sciencedaily.com/releases/2014/10/141014095320.h...
When I looked I was interested, but I was turned off when I saw that the cost far outstripped commercial sequencing services.
London, UK https://biohackspace.org/
Brooklyn, NY, https://www.genspace.org/
Baltimore, MD, https://bugssonline.org/
Australia, https://foundry.bio/
https://abarry.org/dna-sequencing-in-our-extra-bedroom/
http://blog.booleanbiotech.com/sequencing-at-home-with-flong...
What if you could take the (binary) data file of your DNA and use it as input in the (recently remastered) Monster Rancher games to generate a monster? Apparently those games use external user-provided data (like music CDs, game discs etc.) to generate the monsters the player would then train and use (something I only recently learned about through gaming livestreams).
I'd actually like to see the level of jank that would come out of something like that.
Also, your DNA is bootstrapped from your mother's cells. And the prenatal environment has quite a large effect on development, so your simulation might end up quite different from you if we only started with your DNA.
For example today we can already predict the color of the eyes and other phenotype from the DNA.
If you are able to observe enough samples of cell growth and their associated DNA, you can probably model and predict the statistics of a cell from its DNA. Because the cell is itself the result of a lot of chemical processes, the law of large numbers will help smooth those statistics.
Given that we have a lot of cells, the collective behavior is probably entirely governed by these statistics.
What was unthinkable 50 years ago, playing chess better than a human, is now trivial for a $100 device.
And it's not necessarily required that to simulate the growth of a human you'll need to simulate the entirety of chemical reactions in all 50 trillion cells and all that.
I see open-source implementations of BWT-based indexes (FM-Index/FMtree) out there. Out of curiosity, does anyone know of anything using BWTs for compact indexes in more everyday uses (like full-text search), or alternately reasons it doesn't really work outside the genome-alignment use case? Likely it only 'pays for itself' if you really need the space savings (like, it's what makes an index fit in RAM) or else we'd see it in use more places. It'd still be kinda neat to actually see those tradeoffs.
The BWT sees strings as integer sequences. Either "ABC" and "abc" are two unrelated strings, or you normalize before building the index and lose the ability to distinguish between the two.
Search proceeds character-by-character backwards, jumping arbitrarily around the BWT using the same LF-mapping function as when inverting the BWT. You get cache misses for every character.
BWT construction is expensive, because you want a single BWT for the entire string collection. There is a ridiculous number of papers on BWT construction, as well as on updating and merging existing BWTs, but the problem has still not been solved adequately. If your data is measured in gigabytes, you can just pay the price and build the index, but a few terabytes seems to be the practical upper limit for the current approaches.
You can of course partition the data and build multiple indexes, but then you have to search for each pattern in each index. There is no way to partition the data in a way that different indexes would be responsible for different queries.
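For anyone who wants to see the character-by-character backward search mentioned above, here is a minimal sketch of a toy FM-index in Python. It assumes a naive suffix sort and a fully materialized occurrence table, which real tools avoid by sampling and compression, and the names `build_fm_index` and `count_occurrences` are made up for illustration. It is not how BWA or Bowtie implement it.

```python
from collections import Counter

def build_fm_index(text):
    """Build a toy FM-index (C table and occurrence table) for `text`."""
    # Sentinel: assumed absent from the input and smaller than every symbol.
    text += "$"
    # Naive suffix array; fine for a demo, far too slow for real genomes.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT = the character preceding each sorted suffix (wraps around to "$").
    bwt = "".join(text[i - 1] for i in suffix_array)
    # c_table[c] = number of characters in the text strictly smaller than c.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    # occ[c][i] = number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for sym in occ:
            occ[sym][i + 1] = occ[sym][i] + (1 if sym == ch else 0)
    return c_table, occ, len(bwt)

def count_occurrences(pattern, c_table, occ, n):
    """Count matches of `pattern` by backward search over the BWT."""
    lo, hi = 0, n  # current half-open range of matching suffixes
    for ch in reversed(pattern):  # the search consumes the pattern backwards
        if ch not in c_table:
            return 0
        # Each step jumps to an unrelated part of the index (the LF-mapping).
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo
```

Each loop iteration does two lookups at essentially arbitrary offsets, which is where the cache misses come from. Searching for a lowercase pattern in an uppercase text finds nothing, which is the case-sensitivity point made above.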
Last time it was analyzed the conclusion was that there was nothing actionable.
I guess in your case where nothing actionable is found it's benign. It will be the cases where there are risk factors for late onset things - cancer, diabetes, heart disease etc. where it would get sticky.
As for the case where nothing actionable is found- it's not benign. It's absence of information, not information of absence.
In some people's thoughts, making a better society is the first and most obvious thing to do with technology like this, not an accidental consequence of inconvenience. Fortunately, enough of those people are active in the world to make Main Street different to Wall Street, at least sometimes.
They sell swab kits directly, or via NFT purchase, for ~$500 for a 30x near complete sequencing (that's 30 passes for over 99.9% vs 0.2% for 23andme et al). The results are stored in an encrypted AMD SEV-E vault to be accessed by big pharma or individuals, only for specific markers, in exchange for the $GENE token paid directly to the genome owner. Figures touted are $50-80 per request. This token is burned as kits are sold, can be staked, offers rewards like DAO membership, can be gifted to charities researching specific diseases in various populations. It can act as a form of UBI in unbanked populations and puts your DNA back in your control.
To me it's the best use of web3 tech I've come across, so disclaimer, I am invested and a DAO member, but it's early in the project still. They are not quite ready for mass marketing. They are moving over to Polygon for very low transaction fees in January, will be launching the first joint NFT/kit sale (the next season might include personal genetically generated art) to fill the vaults with 10k sequenced genomes. They are over half way already through work with charities, but that is the magic number before big pharma can start making queries. Right now though they are quietly building and preparing before marketing plans kick in later in Q1.
Take a look at https://genomes.io where everything is explained in more detail, the team are presented and the tokenomics set out.
TL;dr - for $500 right now you can get your entire genome sequenced, stored in a vault to earn you passive income, if you agree to each query. But wait for the NFT vs buying directly, it will have more perks.
then you’d need a program like bwa http://bio-bwa.sourceforge.net/ to map your data.
then use https://samtools.github.io/bcftools/howtos/variant-calling.h... or something else to produce variants from the mapping results.
then compare your resultant vcf file to something like dbSNP: https://www.ncbi.nlm.nih.gov/snp/
at this point you can start generating a raw version of a 23andMe report.
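As a rough illustration of the last comparison step, here is a minimal Python sketch that matches called variants against a table of known variants by chromosome, position, reference allele and alternate allele. It assumes both inputs are plain tab-separated VCF text on the same reference build, with the same chromosome naming and already normalized. The function names `parse_vcf` and `annotate_calls` are hypothetical, and a real workflow would more likely use bcftools or a VCF library.

```python
def parse_vcf(lines):
    """Yield ((chrom, pos, ref, alt), variant_id) for each ALT allele."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        # The first five VCF columns are CHROM, POS, ID, REF, ALT.
        chrom, pos, variant_id, ref, alts = line.rstrip("\n").split("\t")[:5]
        for alt in alts.split(","):  # multi-allelic sites list several ALTs
            if alt != ".":
                yield (chrom, int(pos), ref, alt), variant_id

def annotate_calls(call_lines, known_lines):
    """Pair every called variant with its known ID, or None if unknown."""
    known = dict(parse_vcf(known_lines))
    return [(key, known.get(key)) for key, _ in parse_vcf(call_lines)]
```

Calls that come back with `None` are simply absent from the lookup table, which is not the same thing as being novel or meaningful.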
https://www.ebay.com/itm/265148387179
Nanopore is still not quite ready yet for precise and high accuracy sequencing. Give it another five years.
If this can sequence flora, fungi and human DNA for about 10k - I'd buy it, just to experiment and deep dive. That is such a low barrier of entry it itself is interesting.
and i feel like nanopore is the VR of dna sequencing. it’s always just another few years off.
1. A completely genetically determined disease; a rare 100%-going-to-happen deal. (Which you would probably know about already, because your mother, or grandfather died from it...)
2. Some significant, but abstract risk modification.
With 1., you would know, you will get sick/die some time soon in the future, allowing you to live your life accordingly, die without regrets, prepared and so on. You can take that into consideration when planning for a family, taking job offers, procrastinating on the good life with work and retirement plans. Burn bright.
With 2., there is a very, very high chance that lifestyle choices influence the stated risk, as obviously not everybody who has the polymorphism gets sick. So you can get your ass up, exercise, quit smoking and drinking, reduce stress, get regular check-ups, ..., and avoid getting sick or reduce the impact/progression in case you do.
I think, logically, knowing is always better than not knowing. But I understand how anxiety does tell a different story.
"Inaction breeds doubt and fear. Action breeds confidence and courage. If you want to conquer fear, do not sit home and think about it. Go out and get busy." --Dale Carnegie
"You gain strength, courage and confidence by every experience in which you really stop to look fear in the face. You are able to say to yourself, 'I have lived through this horror. I can take the next thing that comes along.' You must do the thing you think you cannot do." --Eleanor Roosevelt
"Fear is the path to the Dark Side. Fear leads to anger, anger leads to hate, hate leads to suffering." --Yoda
"The brave man is not he who does not feel afraid, but he who conquers that fear." --Nelson Mandela
"Nothing in life is to be feared. It is only to be understood." --Marie Curie
"The key to change... is to let go of fear." --Rosanne Cash
"He who is not everyday conquering some fear has not learned the secret of life." --Ralph Waldo Emerson
"We should all start to live before we get too old. Fear is stupid. So are regrets." --Marilyn Monroe
"Fear keeps us focused on the past or worried about the future. If we can acknowledge our fear, we can realize that right now we are okay. Right now, today, we are still alive, and our bodies are working marvelously. Our eyes can still see the beautiful sky. Our ears can still hear the voices of our loved ones." --Thich Nhat Hanh
Perhaps a trusted middleman would be a solution: "just don't tell me about anything that is totally beyond my control".
If someone steals my DNA I can't stop them. But I can at least avoid being swept up in large scale DNA scanning and tracking efforts.
We could even do that without knowing anything about DNA at all. Or predict tomorrow's weather without satellites and computers.
I think you are a bit too enthusiastic about statistics, or too naive about complexity.
It's unlikely even if we improved computing hardware many orders of magnitude beyond all reasonable predictions, that the calculations would be able to simulate all the necessary details; most of our simulations now are based on many approximations due to hardware limitations.
As to the question of "what level of fidelity is required to turn a FASTQ of somebody's genome into an accurate model of the resulting human, with some sort of realistic environment also provided", that's so far beyond what is even remotely comprehensible it's not worth speculating about in terms of science fact; it's just fiction.
Is this also true for nanopores in protein sequencing? This HN comment from a few weeks back [1] pointed out recent progress but perhaps the tech is still not quite there.
Also prevents employment discrimination based on genetics.
https://www.genome.gov/about-genomics/policy-issues/Genetic-...
(disclosure: have had my DNA sequenced by multiple organizations, and it's publicly available)
"How could we know our vendor's vendor was using genetic information in their proprietary risk score?"
"How could we know our client's client was using our score for life, health, or auto insurance/employment/lending/etc decisions?"
It's a "can't unring a bell" situation and the gaps in the regulations and the incentives for bad behavior are enormous.
when i worked on https://github.com/iontorrent/tmap we thought it would be a good idea to do something like a “local alignment” (using https://en.wikipedia.org/wiki/Smith–Waterman_algorithm) after doing a lookup into a burrows wheeler transform on a substring of the “read.”
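For readers who haven't seen it, here is a minimal Python sketch of Smith–Waterman local alignment scoring. The scoring values (match +2, mismatch -1, linear gap -2) are assumptions picked for illustration, and this is not TMAP's implementation, which would be heavily optimized.

```python
def smith_waterman(query, target, match=2, mismatch=-1, gap=-2):
    """Return (best_score, (query_end, target_end)) of the best local alignment.

    End positions are 1-based indices into the two strings.
    """
    rows, cols = len(query) + 1, len(target) + 1
    # score[i][j] = best local alignment score ending at query[i-1], target[j-1].
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if query[i - 1] == target[j - 1] else mismatch
            score[i][j] = max(
                0,                        # start a fresh alignment here
                score[i - 1][j - 1] + step,  # align the two characters
                score[i - 1][j] + gap,    # gap in the target
                score[i][j - 1] + gap,    # gap in the query
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    return best, best_pos
```

The table is O(len(query) × len(target)), which is why it makes sense to run it only on a small window around a hit from the index rather than against the whole genome.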
I'm curious: since there are only 4 bases in DNA, for genomic data, this seems rather inefficient. Is there any advantage in encoding the DNA with two bits per nucleotide?
source for 3.2 billion: https://www.ncbi.nlm.nih.gov/books/NBK21134/#!po=0.485437
In practice, BWT-alignment-based tools may use a forward index and a mirror index of the reversed genome string (not reverse complemented). This dual index approach is important for dealing with mismatches. There's a nice example explaining this for an older tool named Bowtie [2]
With a two bit encoding and both indices it isn't uncommon for a genome index to take up several GB of RAM. For example, BWA uses 2-3 GB for its index [3].
[1] https://en.wikipedia.org/wiki/FM-index [2] https://academic.oup.com/bioinformatics/article/25/14/1754/2... [3] https://academic.oup.com/bioinformatics/article/25/14/1754/2...
There are some great computational benefits using 2 bit encoding for the BWT
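To make the two-bits-per-nucleotide idea concrete, here is a minimal Python sketch that packs four bases into each byte. The bit layout (A=0, C=1, G=2, T=3, lowest bits first) is an arbitrary choice for illustration and not any particular tool's on-disk format. Ambiguous bases such as N are not representable in this toy version and raise a `KeyError`.

```python
ENCODE = {"A": 0, "C": 1, "G": 2, "T": 3}
DECODE = "ACGT"

def pack(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    packed = bytearray((len(seq) + 3) // 4)  # round up to whole bytes
    for i, base in enumerate(seq):
        # Base i lives in byte i // 4, at bit offset 2 * (i % 4).
        packed[i // 4] |= ENCODE[base] << (2 * (i % 4))
    return bytes(packed)

def unpack(packed, length):
    """Recover the first `length` bases from bytes produced by pack()."""
    return "".join(
        DECODE[(packed[i // 4] >> (2 * (i % 4))) & 3] for i in range(length)
    )
```

At two bits per base, 3.2 billion bases come to about 800 MB, versus about 3.2 GB at one byte per base. The length has to be stored separately because padding in the final byte is indistinguishable from a run of A's.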
Gattaca shows eugenics has been so vilified that the audience will root for a character who selfishly commits fraud, risking lives and scientific progress for his own vanity.
The really scary fact is that there would be no need for a police state and segregation. The genetically enhanced would just completely dominate an open and fair competition.
In the movie, either the genetic augmentation didn't work (as well) as expected, or their advantage caused the augmented people to become lazy because they got covered in undeserved status no matter how little or much they worked, as everything depended on which genes they have been "bred" for. Then someone with supposedly bad genes could run circles around them just by working hard.
Maybe some mixture, e.g. in order to protect those kids who fail at the task they have been "bred" for from considering themselves failed humans, Gattaca's society adopted this model where they shower all kids who have the right genes in status. Maybe it's not the kids who are being protected but the companies selling the augmentations.
https://youtube.com/c/thethoughtemporium
He’s got a ton of other interesting projects, the DIY gene therapy is just one that stands out because it seems so risky.
The results have been pretty astounding. I found markers that pointed to poor response to a specific blood thinner my grandfather was put on before he passed. Currently I'm researching the cluster of Bipolar / ADHD / SAD symptoms I experience that all seem to trace back to a certain genotype of circadian rhythm genes I have (thank you, Sci Hub). To boot, some of the studies I've come across have been done on Han Chinese populations that match my ancestry.
Perhaps going too far down this rabbit hole poses a self-diagnosis risk, but the correlations to my family history and my own life experience working with doctors to diagnose and treat symptoms are pretty undeniable. And given that your run-of-the-mill psychiatrist is going to treat you off of a DSM checklist, I feel much more confident knowing there have been genomic studies to back things up, since my doctor isn't up to date on this research, and finding one that would be will be difficult and expensive. I've shared the papers with my doc and he's been supportive, sometimes I feel like I should be getting a discount on services rendered.
Self-diagnosing is not the problem it is made out to be - I live with my symptoms 24/7, the doctor sees me for 5 minutes. The number of times doctors have missed fairly clear signs of trouble in my family is disturbingly high. A simple procedure, done in time, would have saved two people I know.
Unfortunately our educational system teaches you about mitochondria, but not the practical difference between ibuprofen and paracetamol, or CRP.
https://en.wikipedia.org/wiki/Joseph_James_DeAngelo
On April 24, 2018, authorities charged 72-year-old DeAngelo with eight counts of first-degree murder, based upon DNA evidence; investigators had identified members of DeAngelo's family through forensic genetic genealogy.
But this could enable things like finding relatives which is what I got out of the comment about 23andme. Instead of all the data being centralized, storage and comparison could be distributed
Not sure what you are concerned about. What would you expect a bad actor to do with your DNA sequences? I'm genuinely curious.
Music is exactly the same notes, just a unique mix. So why is Sony upset that I want to stream their entire library? But jokes aside...
A few decades ago I fought the military on collecting my DNA. I stalled them long enough to get my honorable discharge and avoid that altogether. It's funny you ask because the commander asked the same thing and joked "Are you afraid we are going to clone you?!" to which I replied, "No sir, you should be afraid you are going to clone me." and we both had a laugh because he knew I was right. The military are not fond of critical/free thinkers. One of me was plenty. I explained that insurance companies were already using this data to retroactively cancel people's policies even if they were not actively afflicted by something. The commander showed me how to use the FOIA request system.
Laws have evolved a little since then but there are plenty of other risks. For starters, I can't easily change my DNA like I can change my debit card. That data can be used to tie me to others or guilt by association which is undesirable drama. It can also be used to try to sell me things. It can also be used to target biological weapons against specific groups of people. There appears to be an imbalance of data sharing in this regard. [1] Then there is simply the matter of privacy. If I want to share my DNA with some lab that is in turn going to sell it out to hundreds of other companies over and over forever, I should at the very least be getting paid a vast amount of money and land and have legally binding contracts and NDAs that cover what is and is not allowed to be done with my data and how long it may be retained. That contract and the laws enforcing the contract must have some serious teeth with very serious ramifications for anyone violating it whether intentionally or by mistake.
I'm more curious what the actual threat might look like.
The marginal utility of your particular genome is minuscule. Without deep phenotypic information from biophysical parameters, it is utterly impossible to learn something novel from any single genome. This makes the marginal value of the genome information very low, both to you and any attacker or user. You would not be paid much for your data even if it was sold over and over because the rates are like those for plays on Spotify.
There are not fixed differences between human populations, and there are dramatic pressures to balancing selection that keep diversity focused in key genomic regions that are critical for immune response. This is to say that it would be damn hard to target any single group with a bioweapon. And if you wanted to target a single individual with a genomically targeted bioweapon, you also have physical access, making the problem of getting genomic information without consent trivial.
People often talk about insurance risk. I suppose that's an attack vector. It's also one that can be regulated with laws and social norms. Fwiw I wonder how often this is primarily an American concern.
Imagine a public genome data repository. People donate their genomes to science and post them there for the world to use and learn from. In my opinion, it would be better for an individual to share their data than not. The reasoning is that no matter what is done with the data, the net effect will be that society learns more about the individual's particular genome than those of people who haven't contributed. This will yield better adaptation of the society to the individual. Literally this might mean that a treatment for something affecting the individual is slightly better. In expectation, the worst thing that can happen is that the individual gains more information about themselves.
I'm curious about the possible abuse scenarios given the ubiquitous use of PCR-testing for nearly two years, now.
If I'm informed correctly, for a viable sample for NGS you need something like 2 mL of saliva (which sounds like little, but it really takes some time: >1 min), not the trace amounts that usually get collected by the swabs?
No, genomes are not "almost the same" because they are all base-4 sequences and thus made up of the same 0s, 1s, 2s and 3s.
We are astoundingly similar, even unusually so for a large mammalian species.
Is there publicly available information on how accurate Guppy is, as well as how the amount of training data scales with improvements in accuracy?
It didn't seem like these things were mentioned explicitly in the Community Update, other than that it’s expected to continue improving, but a clearer roadmap would definitely be much more helpful.
In Spain, for example, we have a public system but it is extremely inefficient in some areas (and very good in others). Of course, you can have private insurance, but you still have to pay your social security. Curiously, the only ones who can decide which system they want are the public servants...
You cannot really avoid the fundamental constraints - anywhere in the world, there are only so many doctors and so much money available for treatments. IDK if USA has a shortage of doctors, but plenty of European countries do. A country like Romania just cannot give its doctors big enough wages to stop them from seeking employment elsewhere, where they will get five to ten times as much (UK, Germany, Switzerland). As a result, local hospitals are seriously understaffed.
Where I live, having personal connections to good doctors gives you an advantage - you will be examined and treated faster. Then there is outright nepotism.
The outgroups are different than in America, but there are always people for whom the system sucks.
What you say may be somewhat true in the context of transmuting the US's "private" bureaucracy into bona fide "government". But it's certainly not a "fundamental constraint" that's impossible to solve. Rather it's a failure of organization, whether critiqued in terms of bottom-up market failure or top-down governance failure.
This is incorrect. Most of the paperwork is done by administrative staff. Paying for that giant staff + the actual medical professionals is why things are so expensive.
Hospitals are not stupid, they won’t waste their most valuable resource (healthcare time) on bureaucratic paperwork.
X is the amount of medical care available
Y is the amount of medical care wanted
If Y < X there is no problem with any of the systems. And, obviously, a certain amount of inefficiency doesn't affect patient care. Plus, perhaps relevant today: when shit hits the fan we can scale up available care quickly.
If X < Y it doesn't matter which system you choose, someone will go without. You can change who goes without but you cannot fix the system by changing the method of dividing care.
Can things be somewhat improved with better organisation? Sure. Probably. But let's not overestimate it either. Let's take a dream scenario: optimal organisation can make 20% more care available. How much more care is wanted? I think we can safely say the US population wants 200% or more than the current system provides. Whilst nobody's opposed to improving organisations, it cannot fix the problem.
Fixing the problem is something you can only do by doubling the medical training available. That'll be a lot of extra dollars, none of which go anywhere near patient care for at least 10 years, so I would expect a lot of strong opposition from a lot of sides. But it's the only way to fix things.
You also still, in that case, need a gelbox + ladder + loading dye + sybrsafe or whatever, so it’s still not nothing.
And no, hospitals' most valuable resource are their billing computers. I think when it comes to providing actual healthcare hospitals are very stupid. You cannot partition any knowledge worker's attention into 10 minute blocks and expect them to achieve anything useful, yet that is what their entire system is designed around. The hospital doesn't have unilateral say of course (an "insurance" company won't pay one doctor the "price" of two if they spend twice as long with a patient), but they're still content optimizing within that status quo outcome - completely scatterbrained care.
And it's not like individual doctors are well rested or happy when you talk to them. The system clearly takes its toll on them (eg disappearing for 5 minutes to go retrieve test results that didn't show up before your appointment). In fact I'd say the vast majority of human talent in the medical system ends up completely wasted.
They are caused by high barriers of entry, which in turn are caused by entrenched elites gatekeeping jobs through absurdly high tuition fees, expecting everybody to take lots of student debt and a very litigation-friendly environment. These costs are then passed on to the general population through a byzantine system of health insurance that leaves a lot of people uninsured.
> Let's take a dream scenario: optimal organisation can make 20% more care available.
In 2020 UK spent 3278 GBP (~4400 USD) per capita on healthcare [1]. USA: 12,530 USD. That's about 3 times less or a difference of 200% [2].
In UK life expectancy is 81.2 years. In USA it is 78.79 years.
3 times more spent to get a worse outcome doesn't seem like "20% difference" to me. Of course there are other factors, but are they enough to overcome 3x difference? I don't think so.
You cannot compare healthcare systems on an X doctors per Y patients basis, because the outcomes aren't linear. It's orders of magnitude more expensive to treat many health problems if you go to the doctor 2 years too late. And the outcomes are worse despite the higher costs. Guess what happens when people have to pay a lot for each visit - often they go too late.
[1] https://www.statista.com/statistics/472940/public-health-spe.... [2] https://www.cms.gov/Research-Statistics-Data-and-Systems/Sta...
Edit: I used to help Google fund researchers like Joe Derisi and others who develop technology to do this, and some of the people I worked with in my academic career are quite good at identifying serial killers from 30 year old DNA. If you're downvoting because you think I'm making this up, you're wrong. If you're downvoting because you don't think large-scale individual detection using genetic sampling of the environment is possible, you're wrong. If you're downvoting because you think you couldn't do a whole genome sequence of an individual using a sample collected in the wild, you're wrong. If you're downvoting because you think this is a terrible idea (morally, ethically), that's fine but I didn't say anything about my own moral or ethical beliefs about this.
It's simply factually correct to say that large-scale individual sample collection (at order tens of thousands, if not hundreds of thousands of individuals in a country the size of the US) is possible. All the technology is there to do this.
> Its "research programme information sheet", last updated on October 21, says the company retains data including "biological samples" and "the DNA obtained from such samples", as well as "genetic information derived from processing your DNA sample ... using various technologies such as genotyping and whole or partial genome sequencing".
The policy also says Cignpost may share customers’ DNA samples and other personal information with "collaborators" working with them or independently, including universities and private companies, and that it "may receive compensation" in return.
[1] https://www.telegraph.co.uk/news/2021/11/14/covid-test-firm-...
> L.A. County Sheriff Alex Villanueva .. was briefed by the FBI about “the serious risks associated with allowing Fulgent to conduct COVID-19 testing,” ... the FBI advised him that information is likely to be shared with China, and that the FBI told him DNA data obtained is “not guaranteed to be safe and secure from foreign governments.”
But also all the unintentional donations: Every pubic hair you lost on the toilet seat, every tampon you disposed, every bandage you ripped off and threw away, every mattress you slept on, chewing gum you've spit out, every ejaculation, every ... you get the idea.
That's why you need laws to regulate this.
I'm joking, I don't think you did anything wrong but I'd hate it if a ridiculous argument such as this example gained any traction:
Example: The government / aliens / whoever released the virus so we willingly gave them our DNA to sequence and match with our assigned ID so they can do XYZ in case the implant in the vaccine doesn't work or if we are smart enough not to get vaccinated.
Scary stuff.
At the base of it, if the gov. of the country one lives in is the enemy, it can’t be a matter of refusing vaccines here and there, that’s not the scale they should be thinking about.
Insurers have auditing requirements to prove what goes into the policy calculation. It is impossible to hide illegal data use at any meaningful scale, and no insurance agency is looking to save a buck on a small number of clients.
Your comparison is irrelevant.
https://genomebiology.biomedcentral.com/articles/10.1186/s13...
I guess there are limits to ensemble methods if the underlying accuracy doesn't increase. I don't work on gene sequencing algorithms but from what I understand of ML ensemble techniques, there are certain assumptions regarding the underlying independence of the errors. The errors for nanopore should be uniform but I am not sure. Any molecular biologist here care to comment?
There are two components that drive sequencing error rate. 1) The chemistry behind the sequencing (for nanopore sequencing this is the "feeding DNA through a pore" bit) 2) the method to convert raw signal into DNA sequence (this is called "base calling").
The gold-standard in terms of error profile for sequencing is currently the Illumina short read platform. Illumina machines are really just microscopes (TIRF scopes for optics folks) that sequence DNA by visualizing incorporation of dye-labeled nucleotides into the sequenced molecule(s) (imagine a really slow PCR [1]). Each base is labeled with a different color, then when a molecule has a match it makes a colored spot on the slide that the machine can read (see here for more info & details of newer chemistries that use fewer colors [2]). This whole process is mediated by DNA polymerase, which itself has a very low error rate. Another important point is that DNA sequenced on the Illumina platform (called a "library") tends to be from "amplified" template DNA, meaning the DNA will have been processed and potentially be missing chemical modifications on the bases that could be present in the organism. This works to Illumina's advantage, because when trying to answer the question of "what is the DNA sequence?" we want the ground-truth DNA, not the modification state.
In contrast, Nanopore sequencing works by feeding a long strand of DNA through a pore and measuring the change in electrical current through the pore (watch the cool video [3]). For the current set of nanopore flowcells, 8 bases of DNA sit in the pore at a time, meaning the current at each timestep is a product of 8 nucleotides in aggregate. This also means that the pore "sees" each base 8 times, but always in the context of an additional 7. In order to basecall from the raw signal, it's not as easy as saying "blue = A"; instead, you have to deconvolve each base from a complex signal. As you might imagine, the folks at Oxford Nanopore & broader research community have turned to machine learning-based base callers to solve this problem, and they work quite well [4]. But they are not perfect. Deconvolving runs of the same base (e.g. "AAAAAAA") is difficult because without well-defined signal changes between bases, the caller has a hard time deciding how many bases it has seen, so a common error mode for nanopore sequencing is to create insertions/deletions at places in the genome with low nucleotide diversity. Another interesting source of errors is that Nanopore library preps are often performed on unamplified DNA, and so in addition to normal A/T/G/C nucleotides, the template DNA can also contain bases with chemical modifications. For example, in bacteria, A's are often methylated, and in humans, C can have all kinds of different modifications (5-methyl-cytosine, 5-hydroxymethyl-cytosine, etc. etc.) and each different modification affects the signal in the nanopore. Therefore, basecallers that weren't trained on modified bases will produce basecalling errors in the presence of base modifications.
Both Illumina and Nanopore basecallers assign a quality score to each base that indicates the probability that the basecaller produced an incorrect value. This is called a Q-score, which is defined as "Q = -10 * log10(P)", where P is the probability that the base call is wrong (i.e. Q / 10 = the order of magnitude of the error probability) [5]. For example, a Q-score of 10 means an error rate of 1 in 10, but a Q-score of 50 means an error rate of 1 in 100,000. For Illumina sequencing, >95% of the reads have a Q-score > 30 (i.e. fewer than 1 error in 1,000 bases), while Nanopore reads tend to have lower average Q-scores (~Q20, i.e. 1 error in 100 bases). For genetics, where a 1-base difference can mean the difference between a severe disease allele and a normal variant, 1 in 100 won't cut it.
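To make the Q-score arithmetic concrete, here is a minimal Python sketch of the conversion in both directions. The function names are my own (hypothetical), not from any basecaller; the only thing taken from above is the definition Q = -10 * log10(P), with P being the error probability.

```python
import math

def q_to_error_prob(q: float) -> float:
    """Error probability implied by a Phred-style Q-score."""
    return 10 ** (-q / 10)

def error_prob_to_q(p: float) -> float:
    """Phred-style Q-score for a given error probability (0 < p <= 1)."""
    return -10 * math.log10(p)

if __name__ == "__main__":
    # Same Q values used as examples in the text above.
    for q in (10, 20, 30, 50):
        p = q_to_error_prob(q)
        print(f"Q{q}: about 1 error in {round(1 / p):,} bases")
```

Every 10 points of Q is one more order of magnitude, so the gap between ~Q20 and Q30 is a 10x difference in expected errors.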
The current-gen Nanopore flowcell chemistry (R9.4.1) is what most people are talking about when they talk about Nanopore error rates, but they've just released a new pore type & made some basecaller upgrades that improve the accuracy to what they call "Q20+", with some claims of Q>30. From the data I've seen, it's impressive; I just haven't got my hands on one yet to see for myself [6]. I think the comment saying "wait 5 years" is an overestimate, but if you want to genotype yourself today, I'd just pay someone for Illumina sequencing and, if you really want to do it as a learning exercise, process the fastq files yourself.
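If you do go the "process the fastq files yourself" route, the per-base quality scores are stored as one character per base. Below is a rough sketch of decoding them, assuming the Phred+33 encoding (Q = ASCII code minus 33) that modern Illumina output uses; check your own files before relying on that. The record and the function names are made up for illustration.

```python
def phred33_to_q(quality_line: str) -> list[int]:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_line]

def expected_errors(quality_line: str) -> float:
    """Sum of per-base error probabilities for one read."""
    return sum(10 ** (-q / 10) for q in phred33_to_q(quality_line))

if __name__ == "__main__":
    # A made-up 4-line FASTQ record: header, sequence, separator, qualities.
    record = ["@example_read", "GATTACA", "+", "IIII5+!"]
    sequence, qualities = record[1], record[3]
    for base, q in zip(sequence, phred33_to_q(qualities)):
        print(base, f"Q{q}")
    print("expected errors in read:", round(expected_errors(qualities), 4))
```

In the made-up read, the four "I" characters are Q40 bases, and the trailing "5", "+" and "!" are Q20, Q10 and Q0, which is why nearly all of the expected error comes from the last few bases.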
I've unintentionally written an essay, so I'll stop here, but real quick to your other point RE: rerunning the sample N times & using the repeats for error correction. This won't work the way you're thinking because a "sample" is actually a collection of DNA molecules that are sampled randomly by the sequencer. You have no way of knowing that the same read between runs was actually from the same molecule, so you can't error correct this way. Interestingly, a totally different sequencing platform from Pacific Biosciences does use this strategy by doing some really cool chemistry that reads the same molecule repeatedly, but I'll spare you the second essay (google "PacBio HiFi" or "circular consensus reads" if you're interested).
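A toy illustration of why repeated reads only help when you know they come from the same molecule: with several passes over one molecule, a per-position majority vote cancels out independent errors. This is a simplified sketch of the general consensus idea, not PacBio's actual algorithm; it assumes the passes are already aligned, equal in length, and contain only substitution errors (no indels). The data is made up.

```python
from collections import Counter

def majority_consensus(passes: list[str]) -> str:
    """Per-position majority vote over pre-aligned, equal-length reads."""
    if len({len(p) for p in passes}) != 1:
        raise ValueError("toy example expects equal-length, pre-aligned passes")
    # zip(*passes) yields one column of bases per position in the molecule.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*passes))

if __name__ == "__main__":
    # Five made-up passes over the same molecule; two contain one error each.
    passes = ["GATTACA", "GATTACA", "GCTTACA", "GATTAGA", "GATTACA"]
    print(majority_consensus(passes))
```

If the five reads instead came from five different molecules (say, two copies of a chromosome with a real variant between them), the same vote would silently erase genuine differences, which is the problem with naively pooling reruns.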
[1] https://en.wikipedia.org/wiki/Polymerase_chain_reaction
[2] https://www.ecseq.com/support/ngs/do-you-have-two-colors-or-...
[3] https://www.youtube.com/watch?v=RcP85JHLmnI
[4] This paper is a tad out of date, but Ryan Wick always writes extremely clear papers: https://genomebiology.biomedcentral.com/articles/10.1186/s13...
[5] https://www.illumina.com/documents/products/technotes/techno...
[6] https://nanoporetech.com/about-us/news/oxford-nanopore-tech-...
Edit: reformatted links for clarity.
And RE: home sequencing, honestly the hardest part for a beginner will likely be the sample prep, since that takes some combination of wet lab experience and expensive equipment. I really wish molecular biology were as simple to get hacking on as writing software. The lag time between doing an experiment and getting a result is so much longer than waiting for things to compile that improving your skills just takes longer.