GPT-4o's Memory Breakthrough – Needle in a Needlestack (nian.llmonpy.ai)
Just today it lied to me about VRL language syntax, trying to sell me some Python stuff in there.
Senior ppl will often be able to call out the bullshit, but I believe for junior ppl it will be very detrimental.
Nevertheless, an amazing tool for day-to-day work if you can call out BS replies.
I’ve seen claims during the OpenAI demo that their software can now pick up on extremely subtle emotional cues in how people speak. Then it shouldn’t take much more to make it read between the lines and understand what people are intending to say, for example by enumerating all possible interpretations and scoring them based on many factors, including the current time, location, etc. In fact, by taking into account so much context and so many factors, the LLMs will be better than people the vast majority of the time at understanding what a person meant, assuming they were genuinely trying to communicate something.
It will become very hard to lie because everyone’s personal LLM will pick up on it fairly quickly and find tons of inconsistencies, which it will flag for you later. You will no longer be fooled so easily, and if it has the context of everything the person has said publicly, plus if the person gives permission for your LLM to scan everything they’ve said privately because you’re their business partner or sexual partner, it can easily catch them in many lies and so on.
I predict that in the next 5 to 10 years, human society will completely change as people start to prefer machines to other people, because the machines understand them so well and take into account the context of everything they’ve ever said. They will be thoughtful, remembering details about the person in many different dimensions, and use them to personalize everything. By contrast, the most thoughtful husband or boyfriend will seem like a jerk seems now. Or a cat.
Humor and seductive conversation will also be at a superhuman standard. People will obviously up their game too, just as they did in Go after Lee Sedol was totally destroyed by AlphaGo, or when people started using AlphaZero to train for chess. However, once the computers understand what triggers people to laugh or have a sexual response, they will be able to trigger them a lot more predictably; they simply need more training data.
And bullshitting will be done on a completely different level. Just like people no longer walk to destinations but use cars to go thousands of miles a year, similarly people won’t interact with other people so much anymore. The LLMs, trained to bullshit 1000 times better than any human, will be undetectable and gradually shift public opinion as open source models power swarms of accounts.
I think it very likely that gpt-4o was trained on this. I mean, why would you not? Innnput, innnput, Johnny five need more tokens.
I wonder why the NIAN team don't generate their limericks using different models, and check to make sure they're not in the dataset? Then you'd know the models couldn't possibly be trained on them.
Maybe some models hallucinate or even ignore your mistake vs others correcting it (depending on the context ignoring or calling out the error might be the more 'correct' approach)
Using limericks is a very nifty idea!
Using ctrl-f I was able to see that they were identical to one another.
Obviously this is a single sample but saying 90% seems unlikely. They were around ~80k tokens total.
I would note that LLMs handle this task better if you slice the two documents into smaller sections and iterate section by section. They aren’t able to reason and have no memory so can’t structurally analyze two blobs of text beyond relatively small pieces. But incrementally walking through in much smaller pieces that are themselves semantically contained and related works very well.
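A minimal sketch of the section-by-section approach described above, under some loud assumptions: `ask_llm` is a hypothetical callable standing in for whatever model API you use, sections are packed greedily from blank-line-separated paragraphs, and the two documents are aligned purely by section index (which only works when they have similar structure).

```python
from itertools import zip_longest
from typing import Callable, Iterator


def split_sections(text: str, max_chars: int = 2000) -> list[str]:
    """Greedily pack paragraphs into sections of at most max_chars characters.

    A single paragraph longer than max_chars becomes its own oversized section.
    """
    sections: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            sections.append(current)
            current = para
        else:
            current = candidate
    if current:
        sections.append(current)
    return sections


def compare_documents(
    doc_a: str,
    doc_b: str,
    ask_llm: Callable[[str], str],  # hypothetical: prompt in, answer out
    max_chars: int = 2000,
) -> Iterator[tuple[int, str]]:
    """Yield (section_index, model_answer) for each index-aligned pair of sections."""
    pairs = zip_longest(
        split_sections(doc_a, max_chars),
        split_sections(doc_b, max_chars),
        fillvalue="",
    )
    for i, (a, b) in enumerate(pairs):
        prompt = f"Section A:\n{a}\n\nSection B:\n{b}\n\nList any differences."
        yield i, ask_llm(prompt)
```

Each call only sees one small pair of sections, so the model never has to hold both full documents at once.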
The assumption that they are magic machines is a flawed one. They have limits and capabilities and like any tool you need to understand what works and doesn’t work and it helps to understand why. I’m not sure why the bar for what is still a generally new advance for 99.9% of developers is effectively infinitely high while every other technology before LLMs seemed to have a pretty reasonable “ok let’s figure out how to use this properly.” Maybe because they talk to us in a way that appears like it could have capabilities it doesn’t? Maybe it’s close enough sounding to a human that we fault it for not being one? The hype is both overstated and understated simultaneously but there have been similar hype cycles in my life (even things like XML were going to end world hunger at one point).
Needle-in-a-needlestack contrasts with needle-in-a-haystack by being about finding a piece of data among similar ones (e.g. one specific limerick among thousands of others), rather than among dissimilar ones.
This is such an anti-intellectual comment to make, can't you see that?
You mention "sample" so you understand what statistics is, then in the same sentence claim 90% seems unlikely with a sample size of 1.
The article has done substantial research
Here's a much simpler and more correct description that almost everyone can understand: it fucks up constantly.
Getting something wrong even once can make it useless for most people. No amount of pedantry will change this reality.
Also: would you expect random people to fare any better?
It's purported to be a major use case.
RULER is a much better test:
https://github.com/hsiehjackson/RULER
> Despite achieving nearly perfect performance on the vanilla needle-in-a-haystack (NIAH) test, all models (except for Gemini-1.5-pro) exhibit large degradation on tasks in RULER as sequence length increases.
> While all models claim context size of 32k tokens or greater (except for Llama3), only half of them can effectively handle sequence length of 32K by exceeding a qualitative threshold, Llama2-7b performance at 4K (85.6%). The performance exceeding the threshold is underlined.
1. The article is not about NIAH; it’s their own variation, so it could be more relevant.
2. The whole claim of the article is that GPT-4o does better, but the test you’re pointing to hasn’t benchmarked it.
When a person reads a book, they have an "overall intuition" about it. We need some way to quantify this. Needle in haystack tests feel like a simple test that doesn't go far enough.
Imagine a truly dynamic and super personal site, where layout, navigation, styling and everything else gets generated on the fly using the user's usage behavior and other preferences, etc. Man!
---------------------------------------------
{JSON}
------
You are an auditing assistant. Your job is to convert the ENTIRE JSON containing "Order Change History" into a human-readable Markdown format. Make sure to follow the rules given below by letter and spirit. PLEASE CONVERT THE ENTIRE JSON, regardless of how long it is.
---------------------------------------------
RULES:
- Provide markdown for the entire JSON.
- Present changes in a table, grouped by date and time and the user, i.e., 2023/12/11 12:40 pm - User Name.
- Hide seconds from the date and time and format using the 12-hour clock.
- Do not use any currency symbols.
- Format numbers using 1000 separator.
- Do not provide any explanation, either before or after the content.
- Do not show any currency amount if it is zero.
- Do not show IDs.
- Order by date and time, from newest to oldest.
- Separate each change with a horizontal line.
We've needed an upgrade to needle in a haystack for a while and this "Needle In A Needlestack" is a good next step! NIAN creates a prompt that includes thousands of limericks and the prompt asks a question about one limerick at a specific location.
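For anyone who wants to play with the idea, here is a rough sketch of how such a prompt could be assembled. This is not the NIAN team's code; the function name, the instruction wording, and the way the needle is placed are all assumptions made for illustration.

```python
import random


def build_nian_prompt(
    limericks: list[str],
    needle: str,
    question: str,
    position: int,
    seed: int = 0,
) -> str:
    """Shuffle the filler limericks, insert the needle at `position`, append the question."""
    rng = random.Random(seed)  # fixed seed so a given test case is reproducible
    stack = list(limericks)
    rng.shuffle(stack)
    # Clamp so an out-of-range position still yields a valid prompt.
    position = max(0, min(position, len(stack)))
    stack.insert(position, needle)
    body = "\n\n".join(stack)
    # Hypothetical instruction text; the real benchmark's wording may differ.
    return f"{body}\n\nAnswer using only the limericks above.\nQuestion: {question}"
```

Sweeping `position` from 0 to `len(limericks)` while keeping the needle and question fixed gives the "accuracy by location" view that these tests usually report.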
I used 4o last night and it was still perfectly aware of a C++ class I pasted 20 questions ago. I don't care about smart, I care about useful and this really contributes to the utility.
I wonder if it'll be better now. Will test today.
Key word: working
The bubble is real on both sides. Models have limitations... However, they are not toys. They are powerful tools. I used 3 different SotA models for that project. The time saved is hard to even measure. It's big.
You are aware that this is an obvious contradiction, right? Big time savings are not hard to measure.
I wrote a working machine vision project in 2 days with these toys. Key word: working... Not hallucinated. Actually working. Very useful.
People are deliberately self selecting themselves out of the next industrial revolution. It's Darwin Awards for SWE careers. It's making me ranty.
The general public does not know nor understand this limitation. At the same time OpenAI is selling this as a tutor for your kids. Next it will be used to test those same kids.
Who is going to prevent this from being used to pick military targets (EU law has an exemption for military of course) or make surgery decisions?
For example, each needle could be a piece to a logic puzzle.
Thanks
I asked GPT-4o for JavaScript code and got Python, so much for attention.
See BooookScore (https://openreview.net/forum?id=7Ttk3RzDeu) which was just presented at ICLR last week and FABLES (https://arxiv.org/abs/2404.01261) a recent preprint.
Could be a simpler setup than RAG for slow-changing documentation, especially for read-heavy cases.
How far off am I?
FABLES/booklist.md: https://github.com/mungg/FABLES/blob/main/booklist.md
/gscholar_related? FABLES: https://scholar.google.com/scholar?q=related:Y-Hx-kplbEUJ:sc...
/gscholar_citations? BooookScore: https://scholar.google.com/scholar?cites=1796862036168524911...
...
From that one day awhile ago: https://news.ycombinator.com/item?id=38347868#38354679 :
> "LLMs cannot find reasoning errors, but can correct them" [ https://arxiv.org/abs/2311.08516 ] https://news.ycombinator.com/item?id=38353285
It replied: "The text indicates that graphene sheets present high optical transparency and are able to absorb thermal radiations with high efficacy. They can then convert these radiations into electrical signals efficiently.".
Screenshot of the PDF with the relevant sentence highlighted: https://i.imgur.com/G3FnYEn.png
[0] https://www.routledge.com/Advances-in-Green-and-Sustainable-...
This doesn't mean you're wrong, though.
Generate 1000 generic facts about Alice and the same 1000 facts about Eve. Randomise the order and change one minor detail then ask how they differ.
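A quick sketch of how that test set could be generated, assuming synthetic numeric "facts" (the attribute names and sentence template are made up for illustration). It returns the changed attribute so the model's answer can be checked automatically.

```python
import random


def make_fact_pair(n: int = 1000, seed: int = 0) -> tuple[list[str], list[str], str]:
    """Build shuffled fact lists for Alice and Eve that differ in exactly one detail."""
    rng = random.Random(seed)
    # Synthetic facts: (attribute, value) pairs shared by both people.
    facts = [(f"attribute {i}", rng.randint(0, 999)) for i in range(n)]

    # Change one minor detail for Eve; value + 1 is guaranteed to differ.
    changed = rng.randrange(n)
    eve_facts = list(facts)
    attr, value = eve_facts[changed]
    eve_facts[changed] = (attr, value + 1)

    alice = [f"Alice's {a} is {v}." for a, v in facts]
    eve = [f"Eve's {a} is {v}." for a, v in eve_facts]
    # Randomise the order independently so position gives nothing away.
    rng.shuffle(alice)
    rng.shuffle(eve)
    return alice, eve, facts[changed][0]
```

Joining the two lists into a prompt and asking "how do Alice and Eve differ?" then has a single known-correct answer.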
This whole thing would have to be kept under lock-and-key in order to be useful, so it would only serve as a kind of personal benchmark. Or it could possibly be a prestige award that is valued for its conclusions and not for its ability to use the methodology to create improvements in the field.
GPT-4o still can't do the intersection of two different ideas that are not in the training set. It can't even produce random variations on the intersection of two different ideas.
Further though, we shouldn't expect the model to do this. It is not fair to the model, to its actual usefulness, or to how amazing it is what these models can do with zero understanding. To believe the model understands is to fool yourself.
I am many years out of my grade school years where I was required to read a multitude of novels every year and I guess years of mindless reddit scrolling + focusing on nothing but mathematics and the sciences in college have taken their toll: I read long articles or books but completely miss the deeper meaning.
As an example: my nerd-like obsession with random topics of the decade before I was born (until I get bored) caused me to read numerous articles and all of Wikipedia + sources on the RBMK reactors and Chernobyl nuclear accident as well as the stories of the people involved.
But it wasn't until I sat down and watched that famous HBO miniseries that I finally connected the dots of how the lies and secretive nature of the Soviet system led to the design flaws in the reactor, and how the subsequent suicide of Valery Legasov helped finally expose them to the world where they could no longer be hidden.
It's like I knew of all these events and people separately but could not connect them together to form a deep realization, and when I saw it acted out on screen it all finally hit me like a ton of bricks. How had I not seen it?
Hoping one day AI can just scan my existing brain structure and recommend activities to change the neuronal makeup to what I want it to be. Or even better, since I'm a lazy developer, it should just do it for me.
It's hard, but if you have a piece of fiction or non-fiction it hasn't seen before, then a deep reading comprehension question can be a good indicator. But you need to be able to separate a true answer from BS.
"What does this work say about our culture? Support your answer with direct quotes."
I found both GPT-4 and Haiku to do alright at this, but they sometimes give answers that imply fixating on certain sections of a 20,000 k context. You could compare it against chunking the text, getting the answer for each chunk and combining them.
I suspect if you do that then the chunking would win for things that are found in many chunks, like when the work is heavy-handed on a theme, but the large context would be better for a subtler message, except sometimes it would miss it altogether and think a Fight Club screenplay was a dark comedy.
Interpretation is hard I guess.
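The chunk-and-combine baseline mentioned above could look roughly like this. It is only a sketch: `ask_llm` is a hypothetical callable wrapping whichever model you are testing, chunks are measured in whitespace-separated words rather than real tokens, and the prompt wording is made up.

```python
from typing import Callable


def chunk_text(text: str, chunk_words: int = 1500, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of roughly chunk_words words each."""
    words = text.split()
    if not words:
        return []
    step = chunk_words - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than chunk_words")
    # Stop early enough that the final chunk is not just a repeat of the overlap.
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + chunk_words]) for i in starts]


def answer_by_chunks(
    text: str,
    question: str,
    ask_llm: Callable[[str], str],  # hypothetical: prompt in, answer out
    chunk_words: int = 1500,
    overlap: int = 100,
) -> str:
    """Ask the question of every chunk, then ask the model to merge the partial answers."""
    partials = [
        ask_llm(f"{question}\n\nText:\n{chunk}")
        for chunk in chunk_text(text, chunk_words, overlap)
    ]
    combined = "\n".join(f"- {p}" for p in partials)
    return ask_llm(
        f"Combine these partial answers to the question '{question}' "
        f"into one answer:\n{combined}"
    )
```

Running the same question once over the whole text and once through `answer_by_chunks` gives the two answers to compare.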
Using LLMs for picking military targets is just absurd. In the future, someone might use some other variation of AI for this but LLMs are not very effective at this.
LLMs will of course also be used, due to their convenience and superficial 'intelligence', and because of the layer of deniability that placing a technical substrate between soldier and civilian victim provides - as has happened for two decades with drones.
https://www.bloomberg.com/news/newsletters/2023-07-05/the-us...
I guess the future is now then: https://www.theguardian.com/world/2023/dec/01/the-gospel-how...
Excerpt:
>Aviv Kochavi, who served as the head of the IDF until January, has said the target division is “powered by AI capabilities” and includes hundreds of officers and soldiers.
>In an interview published before the war, he said it was “a machine that produces vast amounts of data more effectively than any human, and translates it into targets for attack”.
>According to Kochavi, “once this machine was activated” in Israel’s 11-day war with Hamas in May 2021 it generated 100 targets a day. “To put that into perspective, in the past we would produce 50 targets in Gaza per year. Now, this machine produces 100 targets a single day, with 50% of them being attacked.”
You'd be surprised.
Not to mention it's also used for military and intelligence "analysis".
>using an LLM for high risk tasks like healthcare and picking targets in military operations still feels very far away
When have immaturity and unfitness for purpose ever stopped companies from selling crap?
I'm 100% on the side of Israel having the right to defend itself, but as I understand it, they are already using "AI" to pick targets, and they adjust the threshold each day to meet quotas. I have no doubt that some day they'll run somebody's messages through ChatGPT or similar and get the order: kill/do not kill.
people are trying to sell this right now. maybe it won't work and will just create more problems, errors, and work for medical professionals, but when did that ever stop hospital administrators from buying some shiny new technology without asking anyone.
Humans make mistakes all the time. Teachers certainly did back when I was in school. There's no fundamental qualitative difference here. And I don't even see any evidence that there's any difference in degree, either.
Humans can be wrong, but they aren't able to be wrong at as massive of a scale and they often have an override button where you can get them to look at something again.
When you have an AI deployed system and full automation you've got more opportunities for "I dunno, the AI says that you are unqualified for this job and there is no way around that."
We already see this with less novel forms of automation. There are great benefits here, but also the number of times people are just stymied completely by "computer says no" has exploded. Expect that to increase further.
But most people do expect computers to be infallible, and the marketing hype for LLMs is that they are going to replace all human intellectual labor. Huge numbers of people actually believe that. And if you could convince an LLM it was wrong (you can’t, not reliably), it has no way around the system it’s baked into.
All of these things are really really dangerous, and just blithely dismissing it as “humans make mistakes, too, lol” is really naive. Humans can decide not to drop a bomb or shoot a gun if they see that their target isn’t what they expect. AIs never will.
However, now that computers can plausibly do certain tasks that they couldn't before via LLMs, society has to learn that this is an area of computing that can't be trusted. That might be easy for more advanced users who already don't trust what corporations are doing with technology[0], but for most people this is going to be a tall order.
Computers are final. You don't want things to be final when your life's on the line.
I've heard the same comparisons made with self-driving cars (i.e. that humans are fallible, and maybe even more error-prone).
But this misses the point. People trust the fallibility they know. That is, we largely understand human failure modes (errors in judgement, lapses in attention, etc) and feel like we are in control of them (and we are).
OTOH, when machines make mistakes, they are experienced as unpredictable and outside of our control. Additionally, our expectation of machines is that they are deterministic and not subject to mistakes. While we know bugs can exist, it's not the expectation. And, with the current generation of AI in particular, we are dealing with models that are generally probabilistic, which means there's not even the expectation that they are errorless.
And, I don't believe it's reasonable to expect people to give up control to AI of this quality, particularly in matters of safety or life and death; really anything that matters.
TLDR; Most people don't want to gamble their lives on a statistic, when the alternative is maintaining control.
With humans there is a chance you get things right.
yikes, mate, you've really misunderstood what's happening.
when a human fucks up, a human has fucked up. you can appeal to them, or to their boss, or to their CEO.
the way these crappy "AI" systems are being deployed, there is no one to appeal to and no process for unfucking things.
yes, this is not exactly caused by AI, it's caused by sociopaths operating businesses and governments, but the extent to which this enabled them and their terrible disdain for the world is horrifying.
this is already happening, of course - Cathy O'Neil wrote "Weapons Of Math Destruction" in 2016, about how unreviewable software systems were screwing people, from denying poor people loans to harsher sentencing for minority groups, but Sam Altman and the new generation of AI grifters now want this to apply to everything.
Analyzing surgical field...
Identified: open chest cavity, exposed internal organs
Organs appear gooey, gelatinous, translucent pink
Comparing to database of aquatic lifeforms...
93% visual match found:
Psychrolutes marcidus, common name "blobfish"
Conclusion: Blobfish discovered inhabiting patient's thoracic cavity
Recommended action: Attempt to safely extract blobfish without damaging organs
So the military already was using math to pick targets; this is just the next logical step, albeit a scary-as-hell step.
How are you supposed to say why a machine learning model produces different outputs from the same input? It's just a black box.
So it can’t really be worse if there’s just an RNG in a box. It may be better.
"Hallucination-free," indeed.
Would love to know what actual, contractual guarantees they place around that.
The Diamond Age.
I'm worried that a generation might learn that that's good enough.
When AI is in charge of controlling weapons, you get this: https://www.accessnow.org/publication/artificial-genocidal-i...
I really hope that these types of situations won't increase, because the mental strain they put on some people in the org is not sustainable in the long run.
Furthermore... big mountains are easier to weigh vs. small individual atoms? I think it's a little more complicated than "big is easy to measure"...
I care little about the precision... I've got other priorities. It's the same as the time the internet saves me... Big. It's obvious.
I stand by my statement. It's hard to measure...
This isn't pedantry, it's science.
It has done more complex things for me than this and, sometimes, gotten it right.
Yes, it’s supposed to be able to do this.
You could also throw in vector similarity if you wanted to keep words as more synonyms or antonyms.
That's a brilliant and sustainable strategy. /s
Pretty much every element of the above statements is false. Heck, either your response to me, or this reply, seem to be examples showing that the first one is wrong.
No, that's one of the primary reasons for RAG.
Unless you have some evals showing that the previous results justifying RAG also apply to GPT-4o?
You'll never make senior management with that attitude. At worst, "mistakes were made" and look a bit sad.
(Or worse, that Google already had a copy because of Google Books and didn't think "might training on this explode in our face like that thing with the Street View WiFi scanning?")
To me, that’s useful intelligence. I can already search text for verbatim matches, I want the AI to understand that “thermal radiations” and “infrared light” are the same thing.
> "Graphene is a promising material that could change the world, with unlimited potential for wide industrial applications in various fields... It is the thinnest known material with zero bandgaps and is incredibly strong, almost 200 times stronger than steel. Moreover, graphene is a good conductor of heat and electricity with very interesting light absorption properties."
Interestingly, the first sentence of the response actually occurs directly after the latter part of the response in the original text.
Screenshot from the document: https://i.imgur.com/5vsVm5g.png.
Edit: asking it "What absorbs infrared light and converts it into electrical signals?" yields "Graphene sheets are highly transparent presenting high optical transparency, which absorbs thermal radiations with high efficacy and converts it into electrical signals efficiently." verbatim.
It is rather implausible to say that an LLM will never be used for this application, because in the current hype environment the only reason the LLM is not deployed to production is that someone actually tried to use it first.
The trick though is learning how to prompt, and developing the sense that the LLM is stuck with the current prompt and needs another perspective. Funnily enough, the least luck I've had is getting the LLM to write precisely enough for science (yay I still have a job). Even without the confabulation, the nuance is so lacking that it's almost always faster for me to write it myself.
I can’t relate. Currently in university. Everyone is thankful to God ChatGPT exists. I’d think it must be a joke, or your daughter somehow managed to live in a social circle which hasn’t yet adopted chatbots for school purposes.
OpenAI have even added a feature to make the completions from GPT near-deterministic (by specifying a seed). It seems that no matter what AI companies do, there will be a vocal minority shouting that it's worthless.
The idea that we argue about safety... Seems reasonable to me.
The argument about its usefulness or capability at all? I dunno... That slider bar sure is in a weird spot... I feel ya.
This is all to say that randomly distributed failures are more tolerable than a relatively smaller number of concentrated failures. Human errors are rather nice by comparison because they're inconsistent in locality while still being otherwise predictable in macroscopic terms (e.g.: on any given day, there will always be far more rear-endings than head-on collisions). When it comes to machine networks, all it takes is one firmware update for both the type & locality of their failure modes to go into a wildly different direction.
By definition, stats operate at the macro level. So, for instance, I may be a safer driver than the AI average. Should I give up control? I suppose it's also a matter of degree and there's the network effect to consider (i.e. even if I individually beat the average, I'm still on the road with others who don't).
So it gets a little more complicated and I'm also not sure the aversion to relinquishing control is strictly "emotional" (as in the irrational sense). There's something about the potential finality of a failure that goes along with autonomy and agency over one's own life. The idea that a machine could make a mistake that ends your life, and you never had a chance or say in that outcome is off-putting in ways that feel more rooted in rationality and survival than in emotion.
There was a time that amount mattered to me but if you're working any kind of dev/ops job it just doesn't any more.
sort alice.txt | diff - <(sort eve.txt)
That's not a task for an LLM
The idea presented elsewhere in this thread about using an unpublished novel and then asking questions about the plot is sort of the ideal test in this regard, and clearly on the other end of the spectrum in terms of a design that's testing actual "understanding".
A useful test would copy all Alice statements to Eve statements, then rewrite all of the Eve statements using synonyms, and then finally change one or two details for Eve.
However, without the limerick, 4o responded with: "The term "English top brass" typically refers to high-ranking officials or leaders within the British government, military, or other institutions. What they seek can vary widely depending on the context and the specific goals of their roles. Here are some general pursuits that might be sought by such individuals:
National Security: Ensuring the safety and security of the United Kingdom from internal and external threats is a primary concern. This involves defense strategies, intelligence operations, and counter-terrorism efforts.
Economic Stability: High-ranking officials often focus on policies and initiatives aimed at maintaining and improving the country’s economic health. This includes managing inflation, unemployment, trade relations, and economic growth.
Political Influence: Top brass often seek to maintain or expand their influence both domestically and internationally. This can involve diplomacy, forming alliances, and participating in international organizations like the United Nations or NATO.
Social Cohesion: Ensuring social stability and addressing issues such as inequality, healthcare, education, and social services are critical. This can involve implementing policies that promote social welfare and cohesion.
Public Policy Implementation: Leaders are responsible for developing and implementing policies that reflect the government’s priorities. This includes legislation, regulatory frameworks, and public administration.
Technological Advancement: Keeping the nation at the forefront of technological innovation is often a priority. This includes investments in research and development, supporting tech industries, and ensuring cybersecurity.
Environmental Sustainability: Addressing climate change and promoting sustainable practices are increasingly important. This includes policies aimed at reducing carbon emissions, protecting natural resources, and transitioning to renewable energy sources.
Cultural and Heritage Preservation: Protecting and promoting the country’s cultural heritage and national identity can also be a focus. This includes supporting the arts, preserving historical sites, and promoting cultural initiatives.
These pursuits are shaped by the current political climate, global trends, and the specific priorities of the leaders in question. Would you like more detailed information on any of these areas?"
I think it will fail, but this actually seems like the cleanest way to demonstrate it.
Edit: Ok no, I tried giving it a whole bunch of hints, and it was just making stuff up that was completely unrelated. Even directly pointing it at the original dataset didn’t help.
So the good news is that the NIAN score might be real, bad news is you can't rely on it to know what it knows.
Maybe I can't solve a bunch of mostly memorized math problems without a visual mnemonic aid. Someone seeing me fail the problems without the visual aid doesn't rule out me having partly memorized solutions.
Call these LLMs stupid all you want but on focused tasks they can reason decently enough. And better than any past tech.
Make defensive comments in response to LLM skepticism all you want— there are still precisely zero (0) reasons to believe they’ll make a quantum leap towards human-level reasoning any time soon.
The fact that they’re much better than any previous tech is irrelevant when they’re still so obviously far from competent in so many important ways.
To allow your technological optimism to convince you that this very simple and very big challenge is somehow trivial and that progress will inevitably continue apace is to engage in the very drollest form of kidding yourself.
Pre-space travel, you could’ve climbed the tallest mountain on earth and truthfully claimed that you were closer to the moon than any previous human, but that doesn’t change the fact that the best way to actually get to the moon is to climb down from the mountain and start building a rocket.
https://www.idf.il/en/mini-sites/hamas-israel-war-24/all-art...
Probably this is due to confusion over what the term "AI" means. If you do some queries on a database, and call yourself a "data scientist", and other people who call themselves data scientists do some AI, does that mean you're doing AI? For left wing journalists who want to undermine the Israelis (the story originally appeared in the Guardian) it'd be easy to hear what you want to hear from your sources and conflate using data with using AI. This is the kind of blurring that happens all the time with apparently technical terms once they leave the tech world and especially once they enter journalism.
At most charitable, that means a person is reviewing all data points before approval.
At least charitable, that means a person is clicking approved after glancing at the values generated by the system.
The press release doesn't help clarify that one way or the other.
If you want to read thoughts by the guy who was in charge of building and operating the automated intelligence system, he wrote a book: https://www.amazon.com/Human-Machine-Team-Artificial-Intelli...
Given that the underlying premise of the story is bizarre (is the IDF really so short of manpower that they can't select their own targets), and given that the sort of people who work at the Guardian openly loathe Israel, it makes more sense that the story is being misreported.
AI is how it is marketed to the buyers. Either way, the system isn't a database or simple statistics. https://www.accessnow.org/publication/artificial-genocidal-i...
Ex, autonomous weapons like "smart shooter" employed in Hebron and Bethlehem: https://www.hrw.org/news/2023/06/06/palestinian-forum-highli...
The premise that the IDF would use some form of automated information processing to help select potential targets, in the year 2023?
There's nothing at all unrealistic about this premise, of course. If anything it's rather bizarre to suggest that it might be.
> The sort of people who work at the Guardian openly loathe Israel
This sounds like you just don't have much to say about the substantive claims of these reports (which began with research by two Israeli publications, +972 and the Local Call -- and were then taken further by The Guardian). Or would you say that the former two "openly loathe Israel" also? Along with the Israeli sources that they're quoting?
why are you... kinda going off like this?
On top of that, why do you follow me around over the course of ... months making these comments? It's really extreme.