Certified 100% AI-free organic content (substack.piszek.com)
That is a rather boldly positive perspective.
History shows us that great technologies can come and go, and are never distributed uniformly. Think sewers and drainage systems for example.
Obviously sophisticated technologies can be used in odious endeavors, where more efficiency means more atrocity. Think genocides for example.
I don’t see generative AI ever going.
oh.
1. When ambitious young cooks / farmers etc set out to make a name for themselves in the food industry, I don't get the impression they're rushing towards kosher food development en masse. But I could be wrong.
2. Over the course of my life I've observed (and lived) the response to disillusionment with mass-market factory-farm food: a movement towards local, organic, small-batch and visibly produced / performed food production.
3. I want to believe a distaste for or boredom with AI-produced content will lead us all offline and make us more focused on art, music, writing, advice etc produced by people we actually know in real life. Like a farmers market but for everything.
4. If I had to guess though I'd say it's more likely in the short term to just increase the popularity of live streaming everything.
I should really have had ChatGPT proofread this for me and optimize it for upvotes.
CERTIFIED ORGANIC AI-FREE POST
I felt reading the article that AI/ML vs "organic" content will become the difference between fine hand-crafted furniture and flat-pack Ikea stuff.
Sure, the flat-pack is better than nothing for a temporary apartment, but it is something to move beyond, not what we want to be stuck with for the rest of our lives and in all situations.
CERTIFIED ORGANIC AI-FREE POST
That said, if it gets good enough that you have to be an expert to tell AI-generated stuff from human-made, does it really matter? The upper classes will always play the silly "look what I can afford!" status games, and these often don't have a utilitarian component to them - something doesn't need to be actually better, it just needs to be recognized as more refined. Well, and cost more.
While this does sound good to most people, it's always worth remembering that Old South plantations produced nothing but 100% organic cotton, despite their atrocious practices of human slavery.
If we include this perspective, AI has some extremely positive potential. One of the most promising trends in agriculture is the application of AI to the problem of weed and pest control, as well as fertilizer use. AI robots crawling over fields can use image recognition to identify weeds (and kill them with IR lasers) as well as pest infestations in their early stages. Application of fertilizer to individual plants on an as-needed basis (rather than just dumping large volumes on the entire field) also becomes possible. This can be done far faster and more efficiently than by human laborers walking up and down rows of crops, weeding by hand. Potentially, this could make organic-style agriculture as cost-effective as the industrial variety.
Sure, this means fewer field labor agricultural jobs - which are pretty tough, backbreaking jobs by any measure. Similar arguments apply to a lot of the drudgery in creative and artistic endeavors.
Of course, AI could also replace most corporate board positions, and a lot of upper management as well, and hey, why not the shareholders and investors too? Are they of any more value to the operation of a business than the replaceable grunt workers are?
Let's make the world's smartest AI and have it make decisions about how capital should be allocated to promising startups, removing fallible humans from the loop. Resource optimization on steroids!
Yes, I read too much science fiction. See William Gibson, Iain M. Banks, Hannu Rajaniemi, and Adrian Tchaikovsky for examples of what can go wrong (and maybe right).
I've had a lot of fun having ChatGPT write stories for me: I'd ask it to make changes, to add a character, add a motivation, etc. I'm just playing around, and it's 100% produced by a robot and I enjoy it. I don't personally mind having an emotional response to a story generated by a "robot". I don't really understand how it being bot generated cheapens the experience. The emotions that I feel are elicited by my thoughts and reflections based on what I've read and experienced, not by the robot.
I didn't really understand this line either. I guess my answer is yes? Who cares?
So, just like with good cheese, maybe i will go with full-fat
I certainly wouldn't mind. AI-generated content is a statistical summarization of knowledge produced by infinite humans randomly typing on typewriters, after all...
I didn't know that. Makes the Google <-> OpenAI rivalry more interesting.
However, we will have a difficult time when it comes to social media because AI content will become more and more, until it is just.
It seems plausible the next social media will remove the social part completely, and just have people as consumers, creators.
I have noticed it makes me get bored of reading content, and I start to skim through it assuming it's just AI generated waffle.
On the other hand, I find Stable Diffusion the most interesting thing going on in art at the moment...
>can you give me the most common symptom of every vitamin deficiency
Sure, here is a list of common symptoms associated with deficiency in certain vitamins:
Vitamin A: night blindness, dry skin, and poor immune function
Vitamin B1 (thiamine): fatigue, irritability, and muscle weakness
Vitamin B2 (riboflavin): cracks and sores at the corners of the mouth, itchy and scaly skin, and light sensitivity
Vitamin B3 (niacin): pellagra, which includes symptoms such as skin inflammation, diarrhea, and dementia
Vitamin B5 (pantothenic acid): fatigue, insomnia, and muscle cramps
Vitamin B6 (pyridoxine): anemia, skin disorders, and depression
Vitamin B7 (biotin): hair loss, scaly skin, and brittle nails
Vitamin B12: pernicious anemia, which can cause fatigue, muscle weakness, and nerve damage
Vitamin C: scurvy, which can lead to bleeding gums, easy bruising, and poor wound healing
Vitamin D: rickets in children, and osteomalacia in adults, which can cause muscle weakness, bone pain, and fractures
Vitamin E: nerve damage, muscle weakness, and impaired vision
Vitamin K: bleeding disorders, easy bruising, and poor blood clotting
It's important to note that these symptoms can also be caused by other factors and deficiency in vitamins can manifest differently between people. Consult with a doctor if you suspect a deficiency.
>can you list the most recommended food source for each vitamin
Sure, here is a list of some food sources that are high in certain vitamins:
Vitamin A: sweet potatoes, carrots, kale, spinach, and apricots
Vitamin B1 (thiamine): whole grains, nuts, seeds, and legumes
Vitamin B2 (riboflavin): almonds, mushrooms, and dairy products
Vitamin B3 (niacin): chicken, fish, and peanuts
Vitamin B5 (pantothenic acid): mushrooms, avocado, and sweet potatoes
Vitamin B6 (pyridoxine): chicken, fish, and chickpeas
Vitamin B7 (biotin): eggs, nuts, and sweet potatoes
Vitamin B12: fish, meat, and dairy products
Vitamin C: oranges, strawberries, kiwi, and bell peppers
Vitamin D: fatty fish such as salmon, mackerel and sardines, egg yolks and mushrooms
Vitamin E: almonds, sunflower seeds, and spinach
Vitamin K: leafy greens like spinach, kale, and broccoli, also found in parsley, Brussels sprouts and even kelp.
Hulk Hogan would approve https://www.youtube.com/watch?v=0-dI4Sl2rzE
The fear reminds me of the early days of hip hop, where songs remixed from the past were blasted as unoriginal. I think we've all mostly agreed now that you can build new content that honors the old while being completely fresh.
And she cried for a minute before starting to laugh that she was crying from a story made by a bot.
EDIT: Okay, now I'm shocked to see your comment grayed out all the way at the bottom. Interesting times we live in.
At the level of our personal experience of a story, image, or sound it does not matter if the content was human generated or even if the stories are real; they either elicit an emotional response or they don't. This is why we can enjoy fiction and art in the first place. It's also why a given piece of art can mean different things to different people.
Deliberately cutting yourself off from the meaning a piece could have for you just because you aren't yet sure that a human produced it is unnecessarily speciesist and only acts to your own detriment. Should one likewise refuse to engage with works that had assistance from the artist's pets?
With regard to news stories and other informational content, if two stories about the same event are both accurate, timely, informative, and/or insightful, it does not matter if one was written by a human on a typewriter and the other written by a bot after having been prompted by a human. The present state of the art still requires some human editing to hit those marks but we're quickly approaching a time where there will not be any detectable differences.
This idea that people deserve to invent a new category of things to be prejudiced about seems silly and unhelpful to me. Why bother inventing arbitrary distinctions just so people can play favorites and be judgemental about it? People are already free to judge content based on relative merits. That seems like enough to me.
And it's important to account for the fact that the end goal with these tools is that there will be no differences between human and AI generated content. The teams working on these things have already made considerable progress to that end, it's not hard to imagine the next couple of iterations actually achieving that.
So anyone trying to create these arbitrary labels is starting an uphill battle for a fundamentally unhelpful end state. We really don't need to put that kind of energy into reinforcing people's desire to feel superior. We have more than enough elitism/tribalism/racism/otherism around as it is.
I just changed all of my websites to have "100% AI-free organic content" in the copyright footers. (I didn't say "certified" for obvious reasons.)
I find this to be a particularly interesting problem in this whole debacle.
Could we end up having AI quality trend downwards due to AI ingesting its own old outputs and reinforcing bad habits? I think it's a particular risk for text generation.
I've already run into scenarios where ChatGPT generated code that looked perfectly plausible, except that the API it used didn't actually exist.
Now imagine a myriad of fake blogs using ChatGPT under the hood to generate blog entries explaining how to solve commonly asked-about problems, with that then being spidered and fed into ChatGPT 2.0. Such things could end up creating a downwards trend in quality, as more and more of such junk gets posted, absorbed into the model and amplified further.
I think image generation should be less vulnerable to this since all images need tagging to be useful, "ai generated" is a common tag that can be used to exclude reingesting old outputs, and also because with artwork precision doesn't matter so much. If people like the results, then it doesn't matter that much that something isn't drawn realistically.
This is exactly what I don't like about Copilot, maybe even more than the IP ethics of it. If it really succeeds, it's going to have a feedback loop that amplifies its own code suggestions. The same boilerplate-ish kind of code that developers generate over and over will get calcified, even if it's suboptimal or buggy. If we really need robots to help write our code, to me that means we don't yet have expressive enough languages.
Love the idea of having AI-Free declarations of content as it could / should help to differentiate organic content from generated content. It would be very interesting if companies and site owners wished to self-certify their site as organic with something like an /ai-free.txt.
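As a rough illustration of the self-certification idea, here is a minimal sketch of how a crawler might read such a file. Only the `/ai-free.txt` path comes from the comment above; the `key: value` layout, the `ai-free` and `scope` field names, and the helper functions are all hypothetical, since no such standard exists.

```python
from urllib.parse import urljoin


def declaration_url(site: str) -> str:
    """Return the site-root location of the (hypothetical) declaration file."""
    return urljoin(site, "/ai-free.txt")


def parse_ai_free(text: str) -> dict[str, str]:
    """Parse 'key: value' lines; blank lines and '#' comments are ignored.

    The field names are made up for this sketch, not part of any real standard.
    """
    fields: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key.strip().lower()] = value.strip()
    return fields


def claims_ai_free(text: str) -> bool:
    """True only if the file explicitly declares 'ai-free: yes'."""
    return parse_ai_free(text).get("ai-free", "").lower() == "yes"


# A made-up example of what a site owner might publish.
example = """\
# Self-declared, unverified
ai-free: yes
scope: /blog/
"""

print(declaration_url("https://example.com/blog/post"))
print(claims_ai_free(example))
```

The file is only a self-declaration: the sketch reads the claim and does nothing to verify it.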
So now you need something like ChatGPT to cut through the noise?
And what are the consequences for lying?
I'm sure you can appreciate that such an initiative would be wholesale abused from day 1.
If the owner of an SEO site wants to use AI for "content generation", doesn't that mean they didn't care about the human-generated content in the first place?
Seems like a choice between garbage and slightly more expensive garbage. What is interesting or organic about that? Back in the day, people used to put things on their websites because they cared about it and wanted to say it.
To quote Yann LeCun:
Meta will be able to help small businesses promote themselves by automatically producing media that promote a brand, he offered.
"There's something like 12 million shops that advertise on Facebook, and most of them are mom and pop shops, and they just don't have the resources to design a new, nicely designed ad," observed LeCun. "So for them, generative art could help a lot."
https://www.zdnet.com/article/chatgpt-is-not-particularly-in...
So the next question has to be: Was this still the right answer?
I've personally had plenty of instances in my programming career where the code I was working on really needed functions which were best shopped out to a common API. To avoid interrupting my flow and to better inform the API I'd be designing for this, I just continued to write as if the API did exist. Then I went on to implement the functions that were needed.
Perhaps the bot was right to presume that there should be an API for this. You might even be able to then prompt ChatGPT to create each of the functions in that API.
I've definitely done this with Copilot, though—it will suggest an API that doesn't actually exist but logically should in order to be consistent, and I'll go create it.
Yes! I remember generating a seemingly reasonable R script except that the library that it called to do most of the work didn't exist! It was like code from an alternate dimension!
Maybe it was? Haha. ChatGPT has seen some things and knows something we don’t!
Every discussion on AI takes the example of ChatGPT and its inherent flaws, but AI-generated content doesn't have to be dull and low quality.
One question that bothers me: does it really matter? If AI-generated content is on par with human-made content, or even better, does it matter anymore that an AI generated it?
Maybe it's the sentimental value, empathy, fidelity?
If an AI had written Mozart's Requiem would it lessen its interest, its beauty?
Humans get inputs from a large variety of sources, but if an AI's input is just text, then there's the potential for AI's input to mostly consist of its prior output. Iterate this, and its model could gradually diverge from the real world.
The equivalent in human society is groupthink, where members of a subgroup get most of their information from the same subgroup and end up believing weird things. We can counter that by purposely getting inputs from outside of our group. For a text-model AI, this means identifying text that wasn't produced by AI, as the article suggests.
To get away from that we'd have to dramatically change our approach. The LLMs we have are trained on as much content as possible and essentially average out the style of their training data. What it writes reads like a B-grade high school essay because that is what you get when you average all the writing on the internet.
It's not obvious to me that a creative approach that boils down to "pick the most likely next word given the context so far" can avoid sounding bland.
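To make the "most likely next word" idea concrete, here is a toy sketch: a bigram counter with greedy decoding. It is nothing like a real LLM (no neural network, a made-up three-sentence corpus, hypothetical function names), but it shows how always taking the most frequent continuation pulls the output toward the most common phrasing.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def greedy_continue(follows, start, max_words=5):
    """Repeatedly append the single most frequent next word."""
    out = [start]
    for _ in range(max_words):
        options = follows.get(out[-1])
        if not options:
            break  # no known continuation for this word
        out.append(options.most_common(1)[0][0])
    return " ".join(out)


# Made-up corpus: the unusual ending appears once, the bland one twice.
corpus = [
    "the food was good",
    "the food was good",
    "the food was strangely luminous",
]
model = train_bigrams(corpus)
print(greedy_continue(model, "the"))
```

Greedy decoding never produces the rarer "strangely luminous" ending from this starting word. Real systems usually sample from the predicted distribution rather than always taking the top word, so this toy overstates the effect.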
I think it's about intent. Art is interesting and beautiful to us because there is an undeniable human intent in creating it and a vision behind it.
ChatGPT and DALL-E are pretty cool, but I think until AI gets its own intent and goals it's pretty fair to try to separate human art and AI art.
Will be a great question when we can't tell the difference.
As we move forward, I suspect we will see an increase in curation services, and AI models will do more with less. You can bootstrap a productive adult human on an almost infinitesimal slice of the training sets we are using for the current gen of AI; I can't imagine future approaches are going to need such an unbounded input to get better results - but I might be wrong!
If content is curated for its quality, whether or not it's AI generated (or assisted) doesn't matter.
If we focused on quality and stopped caring about who gets paid for what I suspect that not only would we have better quality overall but we’d also push the boundaries much faster thus making things even more interesting.
High school students that are prohibited from using AI for their essays will have a bad time. Even if they don't use AI chatbots themselves, they will unknowingly cite sources that were written by AI, or were written by someone who learned about the topic by asking ChatGPT.
Step 1. Release a tool that generates nonsense code across a thousand repositories, and allow anybody to publish crap to GitHub.
Step 2. Copilot trains on those nonsensical repositories because it can’t distinguish them from the real thing.
Step 3. Copilot begins to generate similar crap.
Not all AI content is necessarily bad, and not all human content is good. We need a way to separate the good from the bad, not AI from human, and it might be impossible to do that 100% correctly anyway.
I am just spitballing. I do not really have a solution in mind. It just sounds like an interesting problem going forward.
No, because models are already trained like that. Datasets for large models are too vast for a human to even know what's inside, let alone label them manually. So instead they are processed (labeled, cropped, etc) by other models, with humans overseeing and tweaking the process. Often it's a chain with several models training each other, bootstrapped from whatever manual data you have, and curated by humans in key points of the process.
So it's actually the opposite - the hybrid bootstrapping approach that combines human curation and ML labeling of bulk low-quality data typically delivers far better results than training on a small but 100% manual dataset.
A great description of what actually happens when you deal with massive datasets. One way to inspect a large dataset is to cluster it, and then look at just a few samples from each cluster, to get an overview.
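A minimal sketch of that cluster-then-sample inspection, assuming each item has already been reduced to a 2D point (in practice these would be model embeddings with many more dimensions, and a library implementation would replace this hand-rolled k-means). The function names and the toy data are made up for illustration.

```python
import random


def kmeans(points, k, iterations=20):
    """Plain k-means on 2D points; returns a cluster index per point."""
    centers = list(points[:k])  # naive init: first k points (an assumption)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest center.
        for i, (x, y) in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2,
            )
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels


def sample_per_cluster(items, labels, per_cluster=2, seed=0):
    """Pick a few items from each cluster for manual inspection."""
    rng = random.Random(seed)
    by_cluster = {}
    for item, lab in zip(items, labels):
        by_cluster.setdefault(lab, []).append(item)
    return {
        lab: rng.sample(group, min(per_cluster, len(group)))
        for lab, group in sorted(by_cluster.items())
    }


# Toy "embeddings": two obvious groups of 2D points.
points = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.3), (4.8, 5.0)]
labels = kmeans(points, k=2)
print(sample_per_cluster(points, labels))
```

A reviewer then looks only at the handful of samples per cluster instead of the whole dataset.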
What if we just crowdsource and have a new Folding@home protein thing but this time it’s for classifying data sets? LAION-5B has 5 billion image text pairs, if we got 10,000 people together that’d just be… 500,000 per person which would take… a while but not forever. Humans can notice discrepancies super quickly. Like a slide show: display the image and the text pair at a speed set by the user, and pause and tweak ones that are outright wrong or shitty.
Boom, refined image set.
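A quick back-of-the-envelope check of the workload, taking the 5 billion pairs and 10,000 volunteers from the comment above. The one-second review time and two hours a day are assumptions picked purely for illustration. The split comes out at 500,000 pairs per person.

```python
PAIRS = 5_000_000_000      # "5 billion image text pairs"
VOLUNTEERS = 10_000
SECONDS_PER_PAIR = 1.0     # assumed review speed, not from the thread
HOURS_PER_DAY = 2.0        # assumed daily volunteer effort

pairs_each = PAIRS // VOLUNTEERS
hours_each = pairs_each * SECONDS_PER_PAIR / 3600
days_each = hours_each / HOURS_PER_DAY

print(f"{pairs_each:,} pairs per person")
print(f"{hours_each:.0f} hours, about {days_each:.0f} days at {HOURS_PER_DAY:g} h/day")
```

Under those assumptions each volunteer would need roughly 139 hours of reviewing, so "a while but not forever" holds only if people stick with it for a couple of months.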
Maybe? I’m looking at the LAION-5B example sets on their website and it seems to literally be this simple. A lot of the images seemed pretty poorly tagged. You get a gigantic manually tagged data set, at least for image classification.
I just had my eyes opened reading that, because humans also do exactly that, inadvertently.
Likewise, images can be scored for aesthetics and consistency and models updated to predict and condition in the same way.
After all, we had the issue of millions of auto-generated bad pages in the web 1.0 SEO days. Search engines addressed it by figuring out how to rely more heavily on human behavior signals as an indication of value of data.
If AI writes the bulk of the content, how long will it be before people simply do not put in the work to make sure things are true or put in the work to discover and understand new true things?
One thing that I wonder though is how we will draw the line. If I'm writing a piece and do a Google search, and in that way invoke BERT under the hood, is anything that I write afterwards "AI-tainted"? What about the grammar checker? Or the spot removal tool in photoshop or gimp? Or the AI voice that reads back to me my own article so that I can find prose issues?
And that brings the other problem: do the general public really know the extent of AI use today, never mind in the future?
With all of that out of the way, yes, I would rather read text produced by human beings, not because of its quality--the AI knows, sometimes humans can't help themselves and just keep writing the same thing over and over, especially when it comes to fiction--but just to defend human dominance.
On my first flight to the US, in the 90's, a rather obese lady in the row in front of me asked the flight attendant: "Excuse me. Do you have fat-free water?"
The flight attendant hesitated a split second, her face not moving an inch. Then she smiled and replied: "We certainly have fat-free water, madam. I'll fetch you a bottle straight away."
I suppose you wrote the 90's lest our mental image was of a 2020's rather obese lady? The woman probably would be skinny today
Anything you’re consuming on the internet or even on a TV may just be random noise generated by some model so why waste your precious time consuming it?
On the flip side why waste your time producing content if it’s going to be drowned in a sea of garbage mashed together by some language model?
The problem is people don't always make the wise decision. Evidence: the junk food industry is alive and kicking.
Some people will disconnect from devices, but others may just say "this is the way things are now" and adjust themselves to the flavor of junk content.
Just because it can be used by non-experts to create crappy written work doesn't mean it can't also be used by people who work with it to augment and improve their existing written work.
To my mind AI is a general purpose technology: https://en.wikipedia.org/wiki/General-purpose_technology
I guess using this mental model, what you are worried about is the equivalent of pollution?
Did the printing press also increase the amount of crap in circulation?
"Assume for the sake of argument that conscious beings have existed for some twenty million years: see what strides machines have made in the last thousand! May not the world last twenty million years longer? If so, what will they not in the end become? Is it not safer to nip the mischief in the bud and to forbid them further progress?"
“He, I know—for the question had been discussed among us long before the Time Machine was made—thought but cheerlessly of the Advancement of Mankind, and saw in the growing pile of civilization only a foolish heaping that must inevitably fall back upon and destroy its makers in the end. If that is so, it remains for us to live as though it were not so.”
That passage has haunted me since. I often wonder if that is the answer to the Fermi paradox. Civilization might be but a brief spark in the long night, separated from others by both time and distance insurmountable.
kicks big pile of old books
“it remains for us to live as though it were not so” is a wonderful line.
1. Before 1970
2. By someone who has consumed AI-generated content
3. With a ~2000-era spellchecker
4. By someone using ~2020-era neural speech-to-text software
5. With a ~2020-era spellchecker
6. By someone with an AI in the traditional editor role (reads it, gives detailed feedback)
7. By a human and an AI working together where the AI generates everything initially but the human fixes things and nothing goes out without human endorsement.
I'd probably draw the line at 7, but you could also argue for 6 or even 5.
The history of the term "handmade" and discussions about what to allow on Etsy come to mind: https://whileshenaps.com/2013/10/etsy-redefines-handmade-aut...
Then I asked people to guess the authorship. Amazingly, people guess correctly only 70% of the time. https://random-poem.com/
why : Is this Poem written by or by ? Guess & Click.
I'm guessing it will get even harder to tell as the AI improves further down the road.
"ridicolously"
--edit
.. weird:
I asked ChatGPT if there is a typo in this poem. This is what it responded with:
Me: Does this poem have a typo? https://random-poem.com/weird
ChatGPT: It appears that there is an intentional typo in the first line of the poem "Weird Al, oh Weird Yankovic Al" instead of "Weird Al Yankovic". Yankovic being the surname of the artist, this addition can be seen as a playful and humorous way to refer to the artist, and give the poem a personal touch.
LLMs have given us a more interesting corridor in the Library of Babel - https://en.m.wikipedia.org/wiki/The_Library_of_Babel - but choosing the wheat from the chaff will still be the human endeavor because of the infinite possible BS.
I'm not sure how to best describe it, but every time I interact with AI, there is very little emotional response from it. Rather it's a "good enough" response rather than a sense of awe or horror.
I've been experimenting with writing recently and wanting to provide some AI imagery to match the emotions I'm expressing. A painting like "Wanderer above the Sea of Fog" evokes many emotions. But when I use the same description such as:
"a man standing upon a rocky precipice with his back to the viewer; he is gazing out on a landscape covered in a thick sea of fog through which other ridges, trees, and mountains pierce, which stretches out into the distance indefinitely."
I get the store-brand version that doesn't invoke any emotion. It is "good enough" to get the point across, but lacking the response. Similar to the countless recreations of the Mona Lisa, there is just something about organic perfection. I'm sure AI will get there one day, but who knows if we will react to it in this sense of wonderment.
We can probably still figure out whether a painting is original or AI-generated, but I don't think we will be able to for much longer as AI improves.
The question is whether we could feel emotions even if the source material is artificial. I think the answer is yes. The human brain can easily be tricked.
So more to the human struggle of the production. At least for me.
Might want to look into better prompt engineering. This is pretty generic and there are better prompts you can use.
Why do we use text? Half of it is about getting something from someone else, BUT even more importantly we write text to change something in the world using our words.
The problem with AI pretending to be a human only exists when that AI doesn't get anything from us and is only good for extracting information from.
It's utterly futile to discuss with AI-generated content here on HN, but it's an amazing experience on ChatGPT, precisely because when we write each other something here on HN, we expect that our words will create some impact somewhere. Some opinion will change or we will befriend someone.
I have 0 problems with interacting with an AI which is an individual machine in Lisbon, learning and evolving as life happens. On the other hand, I hate the idea that I'm talking with a machine in Palo Alto and the only output of my conversation is some statistics that a VC will gaze over and optimize for his own gain.
Just think about how meaning is transferred from human to human: we can compress meaning into a few markings and extract it on the other side only because, as individuals, we have experienced life, and some markings saying "bored" are enough to transfer a very complex situation from person to person. In the current state of AI, being bored doesn't mean anything to LLMs, but if individual AI machines lived human-like lives, I think it would start having meaning for them too.
IMHO, the problem with AI-generated content currently is exactly the same as with spam or other non-genuine content, and has nothing to do with its origins being biological or electronic.
We people will put eyes on a ball and call it our friend; we don't have a problem with that.
So, the article's underlying philosophy is "not there yet". It does not adequately address various real world challenges.
1. How is AI-generated content different from algorithms? I'd suggest drawing a line may be nonsensical.
2. What is the precise ethical motivation for wanting to avoid AI-influenced computation? I don't see a compelling case.
Examples:
A. Do we want civil engineers to use optimization software? Yes.
B. Do we like spelling and grammar checkers? Yes.
C. Do we want content generation software to suggest topics and hyperlinks? Yes.
D. Do we want to try out AI music? Yes. And we want to remix it.
E. Do we want to improve our health by making it more accessible and affordable? Yes.
If our goals include protecting human rights, health, dignity, and so on, we'd better darn well formulate our philosophy and policy goals in a coherent way.
[edit] Actually, just to expand on this a bit, this is effectively an argument for establishing a mark of authenticity that all literate civilizations have always striven to place on their intellectual output. There has never been a single rule uniting these efforts, but civilizations which placed more emphasis on safeguarding and defending the precision of the written word have tended to be rewarded with greater longevity. There's no reason that trend shouldn't accelerate when faced with the threat of inundation by meaningless language models. Just like gold coinage, we're looking at a period of debasement and inflation.
You could actually argue that language models themselves are an expression of anti-semitism, in the sense that they're an attempt to undermine the sacredness of the written word, to destroy or wash out the way that the meaning of words ennobles humanity, and to eradicate the special relationship that the law of language and the language of law create between God and Man. I only say it seems anti-semitic because that particular concept, as a high/sacred value, seems unique to Judaism (from my perspective, I can't think of another culture that considers it an inviolable precept) and so this attempted abolishment of the human hand in the written word seems particularly targeted at those who consider the word sacred; maybe this is yet to be threshed out. Maybe Bari Weiss will write about it once some nazis have ChatGPT come up with a totally bunk but plausible corruption of the Talmud. But love of the written word is something that should rightly be a general human value, because we'll live or die with it, Jews and everyone else, whether we want to or not. All civilizations fall when their coin is debased, and our coin today is information.
[edit2] also, I'm drunk, and I love y'all. I hope this stimulates debate, not hate.
In a practical sense, as a society, we do need something to separate good from bad. Technopriest caste does sound like a fun job description to me (which will inevitably be corrupted and result in its own schism).
The future scares me.
Pure, certified 100% AI generated content. No humans or other animals were directly exploited for the purposes of generating this content.
"It is by caffeine alone I set my mind in motion. It is by the juice of Java that thoughts acquire speed, the teeth acquire stains, the stains become a warning. It is by will alone I set my mind in motion."
good that Automattic is setting this policy for themselves, but as long as this is opt-in there is zero chance this fantasy will be reality
also @simonw’s “AI veganism”: https://news.ycombinator.com/item?id=32639643 it's a good term because it will be about as popular and have as loud and passionate and few adherents as actual veganism
To clarify, this is not a policy but a design choice and an intent of one PM :D I cannot speak for the actual company policy because we are at an exploration phase.
> also @simonw’s “AI veganism”: https://news.ycombinator.com/item?id=32639643 it's a good term because it will be about as popular and have as loud and passionate and few adherents as actual veganism
Diets are such a rich metaphor for the approach to content! Fast food on the consumption side, veganism, kosher, paleo etc. on the production side
While this is undoubtedly true and a problem that needs to be addressed, it's worth considering that humans get things factually wrong too sometimes (intentionally or not). So perhaps a more interesting question is how much more or less correct an AI is than a human on a given task.
People talk about this with self-driving cars all the time. Arguably a self-driving car does not need to drive perfectly (not that this is not a good goal), but if it can drive more safely than the average human driver, there's still a significant chance of improving overall road safety.
It matters what we consume, plenty of humans peddle rotten ideas and false narratives. With AI, I'd look into the models and the intentions of those who created/run it. What sort of content it delivers. Just as it makes sense with human authors to look up their biography and publishing history.
AI content will be like modern pop songs and superhero movies. They fit a formula and people will still consider them “great” because the bar has been lowered by algorithms and reduced appetite for risk. But deep down we know they are not in the same league as works from The Beatles, Sting, or Martin Scorsese.
The reasons these questions are so fun to consider is because we know the answers exist, even if we also know we shall never know them. By contrast had this image just been something output by yet another neural network generator, there would be little to no interest of the same sort, as you would know that anything you infer beyond the most surface level is exactly the same as seeing shapes in the clouds.
[1] - https://upload.wikimedia.org/wikipedia/commons/b/b1/Mona_Lis...
Meanwhile you never get any results from 4chan or KiwiFarms (sites with much more organic content), unless you go out of your way to specifically ask for it.
This didn't work for PGP because people in general don't care about that. And I think people in general don't care if their content is AI-generated or not.
It's not like all human-generated content on the web is tremendously accurate or well-written. Hell, maybe the AI will even be better. :)
https://techcrunch.com/2022/12/10/openais-attempts-to-waterm...
When you see something for yourself, that delivers one kind of knowledge.
When you hear it from a friend, that delivers another kind of knowledge.
When you hear it from somebody you don't know, and never actually met, second or third hand, that delivers yet another kind of knowledge.
And, in our society, that last kind of knowledge is so often treated as equivalent to the first. Which is messed up.
Is my spell or grammar check an AI? No, why not? What if I translate a word from another language while writing, using an automatic translation tool? What about the algorithms that boost (or don't boost) my post through various networks, and pick an audience for it? And the analytics tools I use? etc. etc. And all of this will just become murkier and murkier.
"AI" is itself a problematic term, since as many of us know many of the things so-marketed are little more than a bundle of heuristics or a complicated statistical model. And many of the "authoring" tools generating texts or images or code are really just copy-and-pasting-and-mutating things without a lot of magic in-between, but in clever ways that "trick" us into seeing it as original authorship. (Ok us humans do this too)
Perhaps the most reasonable dividing line is: is there a machine here that is emulating a human, pretending to be one? That is perhaps the thing that bothers me most about the recent wave with ChatGPT, Google's Meena etc -- the authors of these systems didn't have to create systems that presented with a human-like identity with pronouns and the illusion of selfhood. But they did, and now we have the Lemoine incident etc.
Reminds me very much of Dune, and its "Orange Catholic Bible"'s “Thou shalt not make a machine in the likeness of a human mind.” I think Herbert was onto something -- the question of whether something is truly human (or human created) or not might become the key important issue of our time and the ambiguity and confusion around that is going to become troublesome.
I feel like a similar thing already happened with YouTube recommendations
Have you googled for reviews on toaster ovens recently?
Sadly, word of mouth again becomes the de facto only semi-reliable way to separate crap from non-crap, and even that comes with its own set of issues and attempts at gaming.
We could get this whole discussion back to some semblance of sanity if we stopped calling any form of remotely complicated automation "AI". The term might as well be meaningless now.
Nothing about any of all these "AIs" is intelligent in the sense of the layman's understanding of artificial intelligence, let alone intelligence of biological and philosophical schools of thought.
Frankly I had the same thought writing it :D
It's more of a stake in the ground sort of a thing I guess? What I really want is somebody saying "hey, there is an open standard already here" so I can use it.
Right now we seem to be at the stage of "It's just McDonald's/KFC for data - junk food is convenient, cheap and not a problem - therefore mass production generative content won't be a problem".
The food analogy is powerful, but has limits, and I urge you to dig into Digital Vegan [2] if you want to take it further.
>And that brings the other problem: do the general public really know the extent of AI use today, never mind in the future?
The line is drawn at human ownership/responsibility. A piece of content can be 'AI tainted' or '100% produced by AI', what makes the difference is if a human takes the responsibility of the end product or not.
The humans running those processes can attempt to deny ownership or responsibility if they so choose but whenever it matters such as in law or any other arena dealing with liability or ownership rights, the humans will be made to own the responsibility.
Same as for self-driving cars. We can debate about who the operator is and to what extent the manufacturers, the occupants, or the owners are responsible for whether the car causes harm but we'll never try to punish the car while calling all humans involved blameless. The point of holding people responsible for outcomes and actions is to drive meaningful change in human behaviors in order to reduce harms and encourage human flourishing.
In terms of ownership and intellectual property, again the point of even having rules is to manage interactions between humans so we can behave civilly towards each other. There can be no meaningful category of content produced "100%" by AI unless AI become persons under the law or as considered by most humans.
If an AI system can ever truly produce content on its own volition, without any human action taken to make that specific thing happen, then that system would be a rational actor on par with other persons and we'll probably begin the debate over whether AI systems should be treated as people in society and under the law. That may even be a new category distinct from human persons such as it is with the concept of corporate persons.
One could make a strong argument that defending moral principles is preferable to preferring the underlying creative force to have a particular biological composition.
As an example, I don't want a system to incentivize humans kept as almost-slaves to retype AI generated content.
How can one tell the difference between all the gradations of "completely" human generated to not?
Those pushing the concept of "AI-Free" have yet to nail down what amounts and types of automation may be used in the production of a thing before it can no longer be labeled "AI-Free".
On the other hand, "Organic" is very well defined at this point. In the US there is a whole legal framework around labeling foods, drugs, and cosmetics.
https://www.fda.gov/food/food-labeling-nutrition/organic-foo...
https://www.ams.usda.gov/about-ams/programs-offices/national...
There is even a training and accreditation system in place for qualifying people as certifiers of organic practices as well as an application and review process for farms and manufacturers to be certified to use the label.
https://www.ams.usda.gov/services/organic-certification/beco...
Would people pay millions of dollars for an AI generated black square NFT? Maybe some would too, it really shows the retardation of the entire space. With the mainstream it's not even about art and craftsmanship but about big names and make-believe. And that's not just art but all fields including science to a degree. When we get AI, hopefully more people are going to start focusing on substance instead of marketing and illusion.
I also think you're misunderstanding how Google prioritizes content. It's not showing the content people want, as much as content that's optimized with SEO to look appealing.
KF is a gossip site, like Lipstick Alley or DataLounge, but with a focus on e-celebs and internet phenomena. I go there whenever I want to get the dirt (ie. truth) about e-celebs or some weird internet trend or subculture. Its users literally just passively document and discuss this stuff, and there's a policy (and ethos) of "no touch." You and others endorsing its erasure (despite it not breaking any law) would know this (and perhaps secretly do) if you actually lurked there for a while instead of just credulously accepting its critics' characterization. It's also strange how the site is supposedly such a hive of doxxers, harassers, etc., yet nothing seems to happen to the journalists and tech people organizing against it.
I think the real reason it's in hot water is precisely because they just passively, permanently document so much, some of which some people with money and influence don't want documented. Maybe they documented something someone (some billionaire?) secretly doesn't want out there?
Google isn't prioritizing AI content, it's just losing the battle against it. Not the same thing.
Additionally, image submissions are basically never described in the text, so they are unsearchable even when they are live. There's some exceptions with archives but now you're in power-user territory.
Not being compatible with search engines is a side-effect that its userbase actually likes. You are either currently there or you are not.
There are archives, like rebeccablacktech and 4plebs, which Google likewise blackholes. You could argue Google also does not care much for archives, yet I still get StackOverflow clone results, for some reason.
Source: Bitter old SRE who has had to fix many broken software patterns ripped out of StackOverflow and the like.
In fact, that is basically the GAN algorithm in meta form.
Both of these are fascinating questions and, to me anyway, both can be answered with yes. The sheer amount of writing increased exponentially once more people could read, write and publish their own writings (and the internet only exacerbated this trend). In accordance with the Pareto principle, most of it was of poor quality, but the upside was that good output likely did increase in terms of absolute number as well (a few people are bound to write something decent).
I think parent is looking back at history and reasonably inferring potential results (more crap).
I think the answer is, undoubtedly, yes.
The scarcity mindset in an era of information abundance isn't being perpetuated by capitalism, it's being perpetuated by people making their own decisions one transaction at a time.
Markets in our current system are (IMO) a distorted view of what people want. For example TVs: How much of the market is people who simply want a nice clean picture and how much is people who are feature-chasing based on the TV marketing? And then how much of the 'market' is actually smoke and mirrors accounting to increase the value of companies?
Some things that I think are only common because of capitalism (or "capitalism") and could therefore be phased out if we transitioned to an AI-supported system:
- "Keeping up with the Joneses"
- Needing short hits of artificial 'excitement' (such as the kind of novelty drip-fed to us by the movie industry)
- Keeping ideas secret as a form of Intellectual Property protection, and therefore the scarcity mindset that there are no good solutions and no new problems to solve (since we'd all be exposed to the latest of those if we chose)
- Capital being used for poor-quality projects (since many of these again are just about trying to grab more money for the people making them)
- People making their own decisions one transaction at a time. This is definitely underrated. Sure some people will continue to think short-term just because that's how their brains are wired, but I suspect the majority will find themselves able to breathe slowly for the first time and they will naturally unwind into longer and longer-term thinking.
"BRRAAALFF!!"
Perhaps it could be useful if they train the bot to identify cases like this and state that no such references exist but also provide a thesis or suggest a line of study that would produce such a reference.
However, as far as I know, the OpenAI team has not made it a goal to have ChatGPT only produce functional code using existing APIs. So I'm not sure we can call that an incorrect answer based on context.
If the API it demonstrated using logically should exist, it seems like the right answer is still to just go create it.
The difference I was drawing between ChatGPT and Copilot wasn't that Copilot has functionality ChatGPT doesn't, it's that it has context ChatGPT doesn't, so it suggests things related to internal APIs. In a conversation with ChatGPT it would be very difficult to get help with internal APIs, hence my assumption that OP wasn't referring to APIs they have any control over.
While I can easily agree that Copilot is probably the better tool for such questions, it is not clear from the parent comment whether their prompt to ChatGPT was asking to create code to do X or create code to do X using only existing publicly available libraries.
It's not immediately obvious that the bot failed to understand the question or that the answer was an example of the bot failing to understand the programming language. It could easily be that the user had an implied expectation of usability in a context they did not give to the bot.
That scenario is more like you asking a random person on the street, who happens to know Y framework, how to do X in that framework. Your coworker can be expected to get that you are looking for an answer that gets your current task done faster than you would be able to do without their assistance. The person on the street could not reasonably be expected to get that unless you give them that context.
I once employed a journalist to write about the pros and cons of wedding insurance. Just to give you a clue how long ago this was, it was a unique article at the time.
Many years later, every article you will read about wedding insurance (there will be many thousands) is around 90% similar in style and content to the one I paid for.
I dare say you could use any other topic as an example, one thoroughly researched original and many thousand similar copies. I can't see how ChatGPT is not going to make this situation much worse?
It'll tell you what the pros and cons of wedding insurance are, and because eventually it'll have access to your calendar, it'll tailor that answer to the specifics of the fact that you're having a destination wedding during monsoon season in the area.
Once this kind of AI search becomes the default way we look for information, there won't be a point to creating SEO spam anymore. It'll create other problems, of course, but that's the way it goes with new technology.
I think this is probably true for some people, the same sort of person who sees something on Facebook and assumes that it's true. [1] But there are quite a lot of people for whom "according to whom?" is the next question after being told something factual. For them, I think search's job is to find relevant sources and get out of the way.
But I think even finding out is a long way away. The main thing that ChatGPT has nailed is glibness. It produces text that sounds authoritative, whether or not it's correct. And it's often incorrect. People may try ChatGPT search out of novelty or because it feels human. But if they depend on it and feel the real-world impact of a confidently wrong answer, they're going to treat it as a human that's untrustworthy. A blowhard, a liar, a fool. So I'm sure the major search players are going to be very cautious rolling out chat-like things. Google has spent decades building up consumer trust, and they don't need a zillion articles about people whom a too-confident chatbot steered wrong.
[1] E.g., That men in white vans are kidnappers: https://www.cnn.com/2019/12/04/tech/facebook-white-vans/inde...
What about creating new, relevant, interesting content that no one will ever see because search no longer exists? Will site owners continue to do it knowing AI will crawl it, and never send traffic? Probably....not?
How do LLMs of the future get better if website owners are no longer incentivized to create content?
Variation on the Turing Test: prove that it's not a human claiming to be a computer.
Modeling premises and Meta-analysis are again necessary elements for critical reasoning about Sources and Methods and superpositions of Ignorance and Malice.
At the end of the day they just want to optimize search results. And the Overton window of acceptability currently allows "human generated SEO content" but not "AI generated SEO content". It's just an arbitrary rule.
For that matter, nature can obviously be both, and it doesn't have intentional design nor vision behind it. So it's clear that it's not a universal requirement.
What I meant is that something that is "interesting" and/or "beautiful" is just artistic, such as nature as you pointed out. For it to be art, there has to be intent behind it, otherwise it's just aesthetically pleasing.
My point was that art is more than just something that's aesthetically pleasing.
But if we're talking about definitions, surely what really matters is how most in society understand "art"? Now suppose we went around showing Midjourney-generated pics to random people on the streets and asking them whether it's art or not; how many do you think would say "no", or ask questions about artist's intent before giving an answer?
We can expect a bot like this to not really get context clues in natural language, although they seem to be getting better at that, but context is not necessary to have a true and functional understanding of a programming language. That was the point of creating such languages.
Using an API that doesn't exist but logically should once the use cases are demonstrated is not an example of lacking understanding, it is an example of advanced insight. A human might have invented the necessary functions inline with the rest of the project but if they are expressing functionality that is commonly applicable, then a common API for those functions is what the humans would eventually converge upon to clean up the code from the initial inline implementation, making it more consistent and readable.
On the other hand, one of the other posters asked “to generate a parallax effect in Qt/QML. It simply used a QML Element with the name Parallax”. Is this an insight, or is this answering “yes, I could” to “could you pass me the salt?”. Maybe the line between the two is a fine one, and I didn’t realize that yet.
In general, it feels like copying part of the question (“write parallax code”) in the answer is the easy part of the task…
To me there does seem to be some nuance here that's worth noticing. Some examples of this type of response are indeed too cheap and can be chalked up to lack of training data or something.
But in other cases it's actually not immediately obvious whether the answer the user got was their fault for not specifying that they are expecting code that works without additional supporting libraries.
A language model can't reasonably be expected to understand an expectation of usability or fitness for purpose in a context the user didn't specify.
The user was implicitly expecting code that would function when executed immediately and as written with no additional supporting libraries included. This is different from code that would function correctly when executed after having downloaded relevant existing packages. Which is different from code that would function if executed alongside additional supporting code from private libraries the user might not have access to. Etc...
Yet any of those answers fit for the same prompt, "create an R script that does such and such". The bot's lack of insight is on the likely intention behind the prompt rather than on the requested language. I'd say if it produces any code that fits the syntax and grammatical structure of the requested language, that's enough to say it understands the language.
AI is often limited in ways we aren't, but it also trivially consumes more than we can in a lifetime.
"Just text" in the case of GPT-3, but also it is trained on a token count exceeding the number of times an average synapse in a human brain will fire in a lifetime.
It can still get biases from the training set; while I'm not sure if "group think" is quite the right phrase, it does seem to "want" everyone to get along even when asking it to create multiple characters engaged in a conflict. (Or perhaps that's just an artefact of it estimating that I want that). Reminds me — in a bad way — of Jules Verne's From The Earth to the Moon.
Humans are more grounded by having a presence in the physical world. Plus they draw on various sources considered more reliable, like formal training, scientific papers, textbooks, quality journalism, etc. If we want AI to be reliable, we'll need it to put the most weight on similar sources, and maybe even have some real-world presence with sensors and robots.
Eventually AI will be able to produce new reliable information itself. But for that, it would have to recognize factual inconsistencies between sources and logical inconsistencies in arguments, and figure out how to resolve those, and do math correctly. I don't know what the state of the art is here, though ChatGPT tends to fail at basic arithmetic.
While I think I get your point, I'd call that failure mode "believing its own BS", and (perhaps I'm just being cynical here) I think humans collectively also have this failure mode.
That said, there's an old saying: "To err is human, to really foul up requires a computer" — it is quite possible for a machine, with merely the same category of flaws we have and no others, to be really bad for the world just because it's really fast and doesn't sleep.
There are already AIs that take input via image, video, and audio. The AI tech is input agnostic and only requires that someone figures out a way to get the input in.
Distributed Version Control > Work model > Pull Request: https://en.wikipedia.org/wiki/Distributed_version_control#Pu...
sindresorhus/awesome: https://github.com/sindresorhus/awesome#contents
bayandin/awesome-awesomeness: https://github.com/bayandin/awesome-awesomeness
"Help compare Comment and Annotation services: moderation, spam, notifications, configurability" https://github.com/executablebooks/meta/discussions/102
Re: fact checks, schema.org/ClaimReview, W3C Verifiable Claims, W3C Verifiable News & Epistemology: https://news.ycombinator.com/item?id=15529140
W3C Web Annotations could contain (cryptographically-signed (optionally with a W3C DID)) Verifiable Claims; comments with signed Linked Data
I just think that until AI has real intent, it's a different thing than what we do. Which is not saying anything bad about it, quite the contrary.
The definitions will inevitably shift over time, they're already ultimately subjective. If you believe in god, it's not hard to argue that nature is art and quite a lot of people believe in god.
> A language model can't reasonably be expected to understand an expectation of usability or fitness for purpose in a context the user didn't specify.
I agree, but I think we’re at the same time expecting the LM to “understand” a lot more
> With a person, you can slowly get to the bottom of it and develop a relationship.
With this kind of content (with most content on the internet, I'd argue), you really can't.
For now HN is a bastion of conversation across many countries and domains
Today's programmers can see Copilot output and probably think “well that’s not optimal”. Fast forward five years, new CS grads are using Copilot 3.0, and are used to specific auto-completes that Copilot gives for certain tasks, as they may have never needed to go beyond some of the more basic suggestions.
It “feels” like an older programmer seeing a younger web dev and going “you’re wasting MB of memory!”
While it's true the web has gotten slower in many regards, and indeed memory may have been wasted, business value creation typically doesn’t care if a few MB are suboptimally wasted, while the previous generation does.
The airline industry has talked about this, of course, and adoption of robotic surgery has opened up a whole new training problem, because its escape hatch when it goes wrong or can't complete a procedure is often "complete the surgery manually". Which is fine on Day 1 of robotic surgery, but what about day 2 when surgeons typically don't have hundreds of similar procedures under their belts? And where the only time they're called on to exercise the skill is in difficult edge cases?
We have basically turned driving a standard transmission into a weird old person quirk or niche enthusiast skill in the United States. If an automatic transmission required a similar manual fallback or check, how well would that work? Well, it would work fine if basically everyone already had a lot of practice driving a manual--but now? It wouldn't work well at all. Of course, automatic transmissions don't fail like that, and are a lot better at switching gears than AI assistants are at generating code. I worry about the semi-automated approach to self-driving, where the driver may not actually have currency with their driving skill, and where--in the instance that it's necessary--a driver has to react to more complicated situations (they don't have practice with the simple ones, and they have to react not to a hazard but to their car's failure to react to a hazard).
I crammed a gigabyte of ASCII into a string in FreePascal, it was glorious!
Waste all the MB you want!
(I'm 59)
Can =/= Will
And as also pointed out by others, it not only requires effort, but knowledge, and that knowledge will be systematically degraded the more AI-ish code generation is used.
OTOH, those with super-diligent hacker attitudes will start to learn how to find the flaws in generated code and optimize it, so leveraging the tool, but most will just move on to the next task/ticket as soon as the AIish code passes the unit tests. So, super-leveraging AI-generated code will be rare.
How is that different than the plague of junior devs we've always had? Devs will get more senior by identifying and correcting issues in AI code rather than code from their peers. Seems OK, like we just got a whole lot more coding capacity.
If you write a good comment describing name, inputs, outputs, logic, and exceptions, the "generate code from comment" capability is kind of amazing. I'm a terrible, hacky programmer, and it has wholly converted me to documentation-first.
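For anyone wondering what that kind of comment looks like, here is a rough sketch. The function name, the duration format, and the body are all made up by me for illustration (the body is hand-written, not actual tool output); the point is only the shape of the comment: name, inputs, outputs, logic, exceptions.

```python
import re

# parse_duration  (hypothetical example function)
# Inputs:  text (str) such as "1h30m", "45s", or "2h"
# Outputs: total number of seconds as an int
# Logic:   sum every <number><unit> pair, where unit is h, m, or s
# Raises:  ValueError if text is empty or contains anything else
def parse_duration(text: str) -> int:
    seconds_per_unit = {"h": 3600, "m": 60, "s": 1}
    # Reject anything that is not a sequence of <digits><unit> pairs.
    if not text or not re.fullmatch(r"(\d+[hms])+", text):
        raise ValueError(f"invalid duration: {text!r}")
    pairs = re.findall(r"(\d+)([hms])", text)
    return sum(int(number) * seconds_per_unit[unit] for number, unit in pairs)
```

Writing the "Raises" line first is what I would expect to help most, since it forces you to decide what counts as bad input before any code exists.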
I can see this go both ways:
boilerplate code being a great generic solution to a set of problems, but a more seasoned programmer may say “that works, but for our use case the trade offs don’t make sense”
Or alternatively, “this code wasn’t something I knew I could do in language X, and it’s far more efficient”
Perfect example I found a while back: If you know Chinese or Japanese, ask ChatGPT for the stroke order of a certain character and watch how confidently it tells you how to draw a nonsensical scribble.
Even when you ask it for the stroke order of 一, it will tell you to draw a vertical line!
Maybe that's good. There are glib liars on the net, and not all of them are ChatGPT. If people learn to be skeptical of fine-sounding content on the net, maybe they'll apply it to humans, too.
The most basic technique for estimating trust in an answer today is to check the sources, see who said it, why, who is agreeing with them, etc.
If some AI just spits out an answer without any references, you cannot do that. You either have to blindly trust the answer, which will be dangerous, or you'll have to blindly distrust the answer, at which point the AI will be useless.
Step 2: Sources? Check them. Do they exist at all? Do they say what the thing in question says they say?
Step 3: Check for other (non-cited) sources that confirm what the cited sources say. Check for other sources that dispute it.
0: https://www.alignmentforum.org/posts/6Fpvch8RR29qLEWNH/chinc...
Of course, this is only true if new data is distinct from old data. If you just copied your data set 10x and pretended it was a 10x larger data set, it would behave like you expect.
Keep in mind though, language isn't a stationary process.
Note that when plotting corpus size vs unique words, the log plot is expected to be linear.
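If you want to check this on your own corpus, a minimal sketch follows. It counts unique words at a few corpus sizes and fits the slope of log(unique words) against log(corpus size). The synthetic Zipf-like corpus is purely my assumption so the snippet runs without data; swap in real tokens to see whether your plot is actually linear. All function names are made up for this example.

```python
import math
import random

def vocabulary_growth(tokens, checkpoints):
    """Return (corpus_size, unique_word_count) at each checkpoint size."""
    seen = set()
    targets = sorted(set(checkpoints))
    points = []
    idx = 0
    for size, token in enumerate(tokens, start=1):
        seen.add(token)
        if idx < len(targets) and targets[idx] == size:
            points.append((size, len(seen)))
            idx += 1
    return points

def loglog_slope(points):
    """Least-squares slope of log(unique words) versus log(corpus size)."""
    xs = [math.log(size) for size, _ in points]
    ys = [math.log(unique) for _, unique in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def synthetic_corpus(n_tokens, vocab_size=5000, seed=0):
    """Assumed stand-in for real text: word ranks drawn with weight 1/rank."""
    rng = random.Random(seed)
    ranks = list(range(1, vocab_size + 1))
    weights = [1.0 / rank for rank in ranks]
    return rng.choices(ranks, weights=weights, k=n_tokens)

tokens = synthetic_corpus(20_000)
points = vocabulary_growth(tokens, [1_000, 2_000, 5_000, 10_000, 20_000])
print(points, loglog_slope(points))
```

It also shows the copied-data point from above: running `vocabulary_growth` on `tokens + tokens` gives the same final unique count as on `tokens`, because duplicated data adds no new words.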
I think your argument is that this means smaller datasets are ok because they contain "most" of what the larger datasets contain. But I think this data-power-rule implies the opposite. ML models can often get to 80-90% accuracy on some task. Unfortunately, these models often aren't that useful because that missing 10% of accuracy matters a lot to users. So what this data-power-rule implies is that, in order to get the last 10% of gains, you need 10x the amount of data.
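To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes error falls as a power law in dataset size, error(N) = c * N^(-alpha); both that functional form and the exponent values are my assumptions for illustration, not measurements from any real model.

```python
import math

def data_multiplier(error_now, error_target, alpha):
    """How many times more data is needed to go from error_now to
    error_target, assuming error(N) = c * N ** (-alpha)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not 0 < error_target <= error_now:
        raise ValueError("need 0 < error_target <= error_now")
    return (error_now / error_target) ** (1.0 / alpha)

# With a hypothetical exponent of log10(2), about 0.3, halving the error
# (say from 20% to 10%) takes roughly ten times the data.
print(data_multiplier(0.20, 0.10, alpha=math.log10(2)))

# With a steeper hypothetical exponent of 0.5, the same halving takes 4x.
print(data_multiplier(0.20, 0.10, alpha=0.5))
```

The smaller the exponent, the more brutal the multiplier gets, which is the sense in which the last few points of accuracy are the expensive ones.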