219 points
15 days ago 239 comments reply
15 days ago 56 comments reply

It'll be interesting to see where this goes. Amazon has had ML-generated garbage books for years now, and I assume they haven't taken them down because they make money even when they sell garbage.

Maybe there's so much garbage coming in now that they finally have to do something about it? I feel for people trying to learn about technical topics, who aren't aware enough of this issue to avoid buying ML-generated books with high ratings from fake reviews. The intro programming market is full of these scam books.

15 days ago 31 comments reply

I was thinking about buying an air fryer. My search came up with cookbooks specific to that air fryer, and I was intrigued. I found a good 5-star book, but then I found that ALL the 5-star reviews were submitted the same day.

I complained, but Amazon defended the book as legitimate, and since I hadn't purchased it, they would not take any action. (to be honest, I assume frontline customer service reps don't have much experience or power)

So I purchased it, complained, got a refund and then they were able to accept my complaint (after passing the complaint higher in the food chain).

Seriously, how hard was it, Amazon? I guess they're starting to notice.

Take a look at air fryer cookbooks - there are books specific to most makes and models. But everything is ML copypasta all the way up and down - the title, the recipes and the reviews all seem to be generated garbage.
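The tell described above, every 5-star review landing on the same day, is simple enough to check mechanically. A minimal sketch, with entirely hypothetical data and an arbitrary threshold:

```python
from collections import Counter
from datetime import date

def suspicious_date_cluster(review_dates, threshold=0.8):
    """Flag a listing when a large share of its reviews share a single date."""
    if not review_dates:
        return False
    _, biggest = Counter(review_dates).most_common(1)[0]
    return biggest / len(review_dates) >= threshold

# Hypothetical listing: four of five 5-star reviews posted the same day.
dates = [date(2023, 9, 1)] * 4 + [date(2023, 9, 14)]
print(suspicious_date_cluster(dates))  # True
```

Real review-fraud detection looks at far more signals (reviewer history, verified purchases, text similarity), but even this crude check would have flagged the book in question.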

15 days ago 19 comments reply

I'm the author of Python Crash Course, the best-selling introductory Python book for some time now. Years ago, someone put out a book listing two authors: Mark Matthes and Eric Lutz. That's just a simple first-name swap between me (Eric Matthes) and Mark Lutz, the author of O'Reilly's Learning Python. The subtitle is obviously taken from my book's subtitle as well. I assume the text is an ML-generated mess, but I haven't bought a copy to verify that.

I used to comment on reviews for books like these explaining what was happening, but Amazon turned off the ability to comment on reviews a long time ago.

I've spoken with other tech authors, and almost all of us get emails from people new to programming who have bought these kinds of books. If you're an experienced programmer, you probably know how to recognize a legitimate technical book. But people who are just starting to learn their first language don't always know what to look for. This is squarely on Amazon; they have blocked most or all of the channels for people to directly call out bad products, and they have allowed fake reviews to flourish and drown out authentic reviews.

15 days ago 2 comments reply

I think the best way to recognize a legitimate tech book is... visit a Barnes and Noble. If it's a publisher or series you can find printed on the shelf, books are legit.

Unfortunately, online market "platforms" are pretty much untrustworthy for any sort of informational purpose.

15 days ago 0 comments reply

If it's a publisher or series you can find printed on the shelf, books are legit.

Not even that is a guarantee; there have been cases of rip-offs making it through a bunch of book-on-demand services.

All "marketplaces" allowing third parties unlimited, unmonitored access to product listings suffer from that issue.

15 days ago 0 comments reply

Also, just doing your research on any platform other than Amazon helps.

15 days ago 11 comments reply

Why don't beginners start at Python.org, though? It's such a great resource to learn the language.

- it's free, unlike books

- always up-to-date, unlike even the best book after a few months

- easy to choose: heck, there's only one official documentation! No chance of making a mistake here!

15 days ago 1 comments reply

Many beginners do start at python.org. However, if you don't know anything about programming, and you don't know someone who can answer all the little questions that come up, it's really hard to learn from documentation alone. Even the official Python tutorial is fairly inaccessible to many people who are trying to learn a language for the first time.

Almost every Python author I've spoken with recognizes that no one resource works best for everyone. We each write to offer our particular take on a subject, and hope to find an audience that our perspective resonates with. I've never steered people away from documentation; in fact one of my goals is to steer people to the sections of documentation that they're ready to make sense of. One of my end goals is that people no longer need me as a teacher. That was my goal as a classroom teacher, and it's one of my goals as an author.

The idea that there are no mistakes in official documentation is pretty unrealistic. Technical documentation has certainly improved over the last decade or so, but it will never be perfect. Most of us recognize that some areas of programming are better handled by third party libraries. In a similar way, there will always be room for learning resources that are maintained outside of official documentation sources.

15 days ago 0 comments reply

I didn't claim the official docs have no mistakes.

Since there's only one set of official docs, beginners can't go wrong about which docs to use.

As opposed to books, where there are tons of bad choices available (hence the current discussion).

15 days ago 5 comments reply

Are you suggesting people just go read the documentation like an encyclopedia? I don’t know a single person who got their start programming by doing that - just about everyone wants some sort of guide to help lead them in good directions.

15 days ago 0 comments reply

I did. On Windows, Python had (still has?) good offline help, and it included a nice getting-started tutorial. The only book I had was “The C Programming Language”. But they ignited my interest enough to start researching, and I landed on the "Site du Zero" (now OpenClassrooms) platform. The web was sparser, but better, in those days (2010).

15 days ago 3 comments reply

That's more or less exactly how I learned to program: from books, with a few friends. Only after I reached a certain level did I start frequenting places where I met other people working with computers, some of whom were professional programmers.

I still have some of them. They've aged surprisingly well.

15 days ago 1 comments reply

Do the official docs even have tutorials? I'd send beginners to Khan Academy instead.

15 days ago 0 comments reply

Yeah, https://docs.python.org/3/tutorial/index.html. But I would say it is good for those who already know another programming language and not for complete beginners.

15 days ago 0 comments reply

I guess book authors don't like my perspective...

15 days ago 3 comments reply

I stopped frequenting the dev.to community because the average quality of articles got so low that it stopped being worth my time.

15 days ago 1 comments reply

dev.to is blocked on HN for this reason (try submitting a dev.to link; it won't appear under New.)

There's an old thread where dang explains that it's blacklisted (along with many many other sites) due to the consistently poor article quality.

14 days ago 0 comments reply

try submitting a dev.to link; it won't appear under New.

I think you'll see it if you're logged in and have showdead turned on.

15 days ago 0 comments reply

Conversely, if you post something sophisticated there, it will likely bomb. A bunch of emojis and an explanation of JS closures for the hundredth time? Does well!

15 days ago 6 comments reply

Ugh. I hate the, "You're not a customer yet so our CRM system won't let me talk to you."

And what happens when my problem is that your system won't let me place an order?

15 days ago 0 comments reply

False Negatives and False Positives are always connected. On the other side of the equation, there are plenty of bad actors who will casually flag their competitors to score a quick win. Crime doesn't like to go uphill - raising the stakes for feedback lowers the prevalence of bad actors.

15 days ago 1 comments reply

I think that's a different issue. Amazon has thorny problems with takedowns. Company A trying to get rival company B's listing taken down probably happens hundreds of times a day. I believe Amazon uses "proof of purchase" kinda like a CAPTCHA or proof of work: an extra hoop to jump through to reduce the volume of these things they have to adjudicate.

15 days ago 0 comments reply

It should be a term of service that you’re not allowed to interfere with other customers’ listings.

If I found out one of the tenants on my multi tenant system was trying to mess with another’s, I would be livid.

15 days ago 2 comments reply

CRM should never mean Sales Prevention as a Service.

15 days ago 0 comments reply

The great thing about filtering is that you don't have to hear the screams.

These accidents play out in slow motion until someone corners you at a family reunion and asks why their friends can't create accounts; when you ask how long it's been, they say "months".

15 days ago 0 comments reply

You'd think... but in a growing B2B company, the CRM is where sales get prevented under a certain threshold. Heh.

15 days ago 1 comments reply

Seriously, how hard was it, Amazon? I guess they're starting to notice.

It's not hard. It's a cost center, and they're in the business of making money - not providing the best service.

15 days ago 0 comments reply

Their biggest risk has always been the perception that they peddle fraudulent simulacra of worthy products.

15 days ago 0 comments reply

It’s the same across all big tech. Complaint handling doesn’t scale with their size and volume. It’s either filtered out by some machine learning algorithm, or reviewed by some poor person in a third-world country getting paid next to nothing, so quality isn’t a priority.

There’s been a recent influx of scammers in Facebook local groups: air con cleaning, car valeting. Everyone’s calling out the scammers in the comments, yet when you click report, FB’s response is “we have reviewed the post and it has not breached our guidelines; would you like to block the user?”

15 days ago 0 comments reply

If I don't get where I want to be with the front door customer service within a decent amount of time, I have always had good success contacting [email protected]. Their executive support team gets back quickly via email or phone and they really seem to care.

15 days ago 1 comments reply

Garbage books are used for money laundering.

You buy books using stolen credit cards and such.

https://www.theguardian.com/books/2018/apr/27/fake-books-sol...

15 days ago 0 comments reply

I wonder if that means the Feds made a phone call to Jeff on his private line and said we need to have a little chat.

We can track money laundering when there are X fake books. We can't when there are 10X fake books.

15 days ago 6 comments reply

I knew a guy who made "generated" textbooks in 2010. He would absorb several articles and loosely stitch them into chapters with some computer scripts and from memory. In a week he could produce 400 pages on a new subject. It was mostly coherent and factual (it kept references). Usually it was the only book on the market about a given subject (like a rare disease).

Current auto-generated garbage is very different.
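The stitching pipeline described above is easy to imagine; here is a crude sketch of that kind of script, with entirely hypothetical function and data names (the real thing apparently involved a lot of manual rewriting from memory as well):

```python
def stitch_book(articles):
    """Naively stitch (title, body, references) tuples into numbered chapters."""
    chapters = []
    for n, (title, body, refs) in enumerate(articles, start=1):
        ref_lines = "\n".join(f"[{i}] {r}" for i, r in enumerate(refs, start=1))
        chapters.append(f"Chapter {n}: {title}\n\n{body}\n\nReferences:\n{ref_lines}")
    return "\n\n".join(chapters)

# Hypothetical source material for a book on a rare disease.
book = stitch_book([
    ("Overview", "The condition was first described in...", ["Smith 2004"]),
    ("Treatment", "Current therapies include...", ["Jones 2007", "Lee 2009"]),
])
print(book.splitlines()[0])  # Chapter 1: Overview
```

Even this trivial assembly keeps the citations attached to their chapters, which is more provenance than most of today's LLM-generated books offer.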

15 days ago 2 comments reply

I wouldn't even consider that generated. That's like where useful content and copyright infringement overlap on a Venn diagram.

15 days ago 1 comments reply

That's like where useful content and copyright infringement overlap on a Venn diagram.

That sounds like a description of LLM-generated content to me ;-)

15 days ago 0 comments reply

LLMs only ever accidentally generate useful content. They fundamentally can't know whether the things they're outputting are true, they just tend to be, because the training data also tends to be.

15 days ago 0 comments reply

For several years now, Amazon KDP has blocked books whose content is already available on the web. I have printed a few books whose content was either CC-BY or public domain due to its age, and in each case my book was automatically blocked in the early stages. I had to submit an appeal that was reviewed by a person in order to proceed.
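Amazon doesn't document how KDP's duplicate check works, but textbook near-duplicate detection is easy to sketch: break both texts into word shingles and compare the sets. A toy version (the shingle size k and the 0.5 cutoff are arbitrary choices here):

```python
import re

def shingles(text, k=8):
    """Set of k-word shingles, lowercased with punctuation stripped."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 0))}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

# Hypothetical check of a submitted manuscript against a scraped web page.
page = "It was the best of times, it was the worst of times, " * 4
manuscript = page + "with a short new introduction added by the publisher."
overlap = jaccard(shingles(manuscript), shingles(page))
print(overlap > 0.5)  # True: mostly copied, so it would be flagged
```

Whatever Amazon actually runs is surely more elaborate (and apparently errs on the side of blocking legitimately reprinted public-domain text, as described above).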

15 days ago 1 comments reply

That explains the CouchDB book from O'Reilly from that time.

15 days ago 0 comments reply

Do people still use CouchDB? Blast from the past!

15 days ago 0 comments reply

… I assume they haven't taken them down because they make money even when they sell garbage.

I’d be surprised if this is the case. The money they make is probably a rounding error compared even just to other Kindle sales. Much more likely is that they haven’t seen it as a big enough problem - and I’m willing to bet it’s increased multiple orders of magnitude recently.

14 days ago 0 comments reply

Behind the Bastards did a 2 part episode on how these "books" are preying on children and busy parents: https://www.iheart.com/podcast/105-behind-the-bastards-29236...

15 days ago 0 comments reply

Maybe there's so much garbage coming in now that they finally have to do something about it?

It seems like this is preventive action rather than reactive, as they say there hasn't been an increase in publishing volume: "While we have not seen a spike in our publishing numbers..."

15 days ago 0 comments reply

I thought it was mostly filled with low-quality Mechanical Turk garbage books.

15 days ago 10 comments reply

In my opinion, all we learn over time is that we need gatekeepers (publishing houses in this case). The general public is a mess.

15 days ago 1 comments reply

I think what we’re seeing here is a symptom of the broader and more fundamental problem of trust in society. We’ve gone from a very high trust society to a very low trust society in just a few decades. We, as technology people, keep searching (desperately) for technical solutions to social problems. It’s not working.

15 days ago 0 comments reply

Because technology never was the solution for social problems, it's a solution to the few people getting very rich problem.

15 days ago 4 comments reply

The standards for filtering internet data have dropped badly.

Amazon and Google both abuse their filtering systems on a daily basis to effect social change.

We need new companies built with policies to keep the filtering systems rigid, effective and unchanging. We need filterkeepers.

15 days ago 3 comments reply

I’m good with Amazon and Google over some unknown. I don’t want some right wing shit to be my gatekeepers.

15 days ago 2 comments reply

Yay, politics in my business soup. That'll generate a quality outcome for my customers!

/s

The politics are ephemeral, the results matter.

15 days ago 1 comments reply

Human decency transcends your asinine, barely disguised political talking points.

15 days ago 2 comments reply

Such systems just result in content that is terribly bland, or worse, intentionally limited to push specific political narratives.

I'd rather have a much more diverse and interesting set of content to choose from, even if some of it might not be to my liking, and even if I'd have to put some effort into previewing or filtering before I find something I want to consume.

15 days ago 1 comments reply

Some people value their time, energy, and money more. I can appreciate that you do not, as we all have choices, but I imagine most people would disagree.

15 days ago 0 comments reply

Some people value their time, energy, and money more.

More highly "curated" media providers have almost always been the least-efficient, most-costly, and least-satisfying for me.

Buying physical books at a bookstore has typically been a costly waste of time, with the selection being poor, and it requiring time, money, vehicle wear, etc., to actually get to the store.

Public libraries are often worse in terms of selection, and thanks to the ones where I am being funded via taxation, I'm stuck paying for them even if I don't use them.

Online and ebook sellers are somewhat better, although they can still be costly, and the delivery of physical books can take some time.

I've had much better success finding fiction and non-fiction content by doing some searches and seeing which random websites, forums, and other less-"curated" online resources I happen to run across.

It has been the same for video media, too.

OTA TV is relatively cheap, but the selection is so limited as to make it useless.

Cable and satellite TV have upfront costs, and then ongoing costs, plus a relatively limited selection of content available at any given time.

Paid online streaming providers have a cost, obviously, and I've found the selection to be quite poor.

Movie theatres are extremely costly for what you get, have a tremendously limited selection, and also involve significant travel and time costs.

Tape and disc rentals no longer exist today where I am, aside from public libraries. They had per-rental costs, late fees, travel costs, and very limited selection. As stated before, I pay for the library even if I don't use it.

YouTube, on the other hand, gives me a much better experience than the more "curated" providers. With just a minute or two of searching, I can find hours and hours worth of content to watch each evening, I can view this content with almost no delay, the cost is minimal, and the content is far more entertaining and informative than the more "curated" options.

Avoiding "curated" media providers has saved me a lot of time, energy, and money, in addition to providing me with much more enjoyable and useful content.

15 days ago 13 comments reply

I think we will see tidal waves of 'not-so-good' AI-generated content. Not that AI can't generate or help generate 'good' content, but it will be faster and cheaper to generate 'not-so-good'.

These waves will mainly be in places in which we are the product. And those waves could make those places close to uninhabitable for folks who don't want to slosh through the waves of noise to find the signal.

And in turn that perhaps enables a stronger business model for high quality content islands (regardless of how the content is generated) - e.g. we will be more willing to pay directly for high quality content with dollars instead of time.

In that scenario, AI could be a good thing in helping to spin a flywheel for high-quality content.

15 days ago 3 comments reply

Assuming not too many people die eating mushrooms while we're waiting: https://www.theguardian.com/technology/2023/sep/01/mushroom-...

15 days ago 2 comments reply

Common foraging rhetoric is that you need two independent sources asserting that a wild food is edible: ones that cite neither each other nor the same chain of citations. And preferably a human who says, "I've been eating these for years and no problems," or scientists who did recent blood work to make sure you aren't destroying your organs by eating them [1].

In a world with fake books, it would be quite easy for two books to contain the same misinformation or misidentification (how many times have I found the wrong plant in a Google image search? More times than I care to count). Two fake books put the wrong picture next to a mushroom because the images were contiguous on some other page, and you have dead people.

[1] In the ten years since I started working with indigenous plants, wild ginger (asarum caudatum), has gone from quasi-edible to medicinal to don't eat. More studies show subtler wear and tear on the organs (wikipedia lists it as carcinogenic!) and it is recommended now that you don't eat them at all, even for medicinal purposes. I'm not sure I own a foraging or native species book younger than 5 years, and many are older.

15 days ago 1 comments reply

Damn had no idea about wild ginger. That is a bummer.

14 days ago 0 comments reply

Isn’t it. It went from somewhere around 10th on my planting list to, “when I get really bored”.

15 days ago 8 comments reply

Except they shouldn't be islands. Unify/standardise the payment mechanism, make it frictionless and only for content consumed. There's no technical reason you shouldn't see an article on hn or wherever, follow the link and read it and pay for it without having set up and pay for a subscription for the entire publication or jump through hoops. It should be a click at most.

There will always be a place for subscriptions, but people want the hypertext model of just following a link from somewhere and there is absolutely no technical reason for that to be incompatible with paying for content. The idea that ads are the only way to fund the web needs to be challenged, and generative AI might just provide the push for that to finally happen.

Or maybe there will be no such crisis and it'll just make the whole thing even more exploitative and garbage-filled.

15 days ago 4 comments reply

There's no technical reason you shouldn't see an article on hn or wherever, follow the link and read it and pay for it without having set up and pay for a subscription for the entire publication or jump through hoops. It should be a click at most.

People have been saying this and building startups on this and having those startups crash and burn for decades.

It's not a technical problem. It's a psychology problem.

Paying after you've read an article doesn't provide the immediate post-purchase gratification to make it an impulse purchase [0]. The upside of paying for an article you've already read is more like a considered purchase [1]. But the amount of cognitive effort worth putting into deciding whether or not to pay for the article is often less than the value you got from the article itself. So it's very hard for people to force themselves to commit to these kinds of microtransactions. See also [2].

It's just a sort of cognitive dead zone where our primate heuristics don't work well for the technically and economically optimal solution. It's sort of like why you can't go into a store and buy a stick of gum.

[0]: https://en.wikipedia.org/wiki/Impulse_purchase

[1]: https://en.wikipedia.org/wiki/Considered_purchase

[2]: https://en.wikipedia.org/wiki/Bounded_rationality

15 days ago 3 comments reply

I'm a bit confused here. I never said the click would be after reading the article. You would need to pay to read.

Edit: Ah, I did say

see an article on hn or wherever, follow the link and read it and pay for it

That wasn't supposed to be a chronological sequence of events, but I see I accidentally implied that. Apologies for the confusion.

14 days ago 2 comments reply

You would need to pay to read.

Now you're talking about a paywall.

Many news organizations are going in that direction but with subscriptions.

Doing it where you pay for each article is also psychologically hard. Since you are actually spending some money, the choice requires some mental effort. But since you haven't read the article, it's a choice whose value is very hard to estimate.

Choices where:

1. The value is hard to predict.

2. The cost is low.

3. The perceived maximum value is also likely low.

are also sort of a worst case for the heuristics our brains have evolved to apply. It's difficult to get our brains to even put the mental effort into making the choice of whether or not to buy the article.

14 days ago 1 comments reply

A paywall is around an entire site. You have to pay for the whole thing or not at all, and you have you do it separately for every site. Subscriptions to the NYT, WSJ, Economist and the FT would cost a fortune, and I would read a minute fraction of the content. As a result I don't have subscriptions for any of them, and none of them get a penny from me. With a common system, paid with a single click per article, I and many others would happily rack up significant tallies reading individual articles across all these publications. It's a win/win.

I don't buy the argument that people wouldn't be bothered. A popup with a balance, cost for the article and yes/no button would be far less mental effort than I already spend finding how to refuse consent on tracking popups, and about the same effort as required for those that simply click the button to grant consent. If that were too much to expect people to do nobody would be reading any articles in the EU at present.

15 days ago 2 comments reply

It should be a click at most.

Welcome to the "new and interesting ways to defraud people over the internet for money" school of thought.

At least with Amazon it's a "one and done shop" of who I spent my money with when I bought something.

Imagine tomorrow with your click to pay for random links on the internet you suddenly have 60,000 1 cent charges. They all appear to go different places and to get a refund you need to challenge each one.

15 days ago 0 comments reply

It sounds like the digital version of the CD scam. https://viewing.nyc/nyc-scams-101-dont-get-fooled-by-the-cd-...

15 days ago 0 comments reply

I think you're imagining this would be open to random individual bloggers, but that wouldn't solve the quality / clickbait / AI generation problem. Sure, individuals could scam, but they could also produce clickbait, low effort crap.

The context of this discussion is the high quality, paid, edited writing that is currently behind site-wide subscription paywalls at sites like the New York Times, Wall Street Journal, Financial Times, Economist, etc. It would be great to lower the barrier to entry for individual writers as far as possible, and maybe even include some sites that are run more like blogging platforms, but there would always have to be content standards and some degree of editorial control for reasons other than avoidance of scams, and with those things in place avoidance of scams is a non-issue because you're dealing with organisations that are trading on reputation. The New York Times isn't going to be defrauding its readers (and neither is Medium if it comes to that).

15 days ago 103 comments reply

While not exactly the same, the invention of the printing press caused a lot of controversy with the Catholic Church. With the printing press, people could mass produce and spread information relatively easily. I'm sure a lot of it was considered "low quality" (also heretical)[1]. Seems like we're going through similar growing pains now. Yes I know it's different, but it rhymes.

1. https://en.wikipedia.org/wiki/Index_Librorum_Prohibitorum

15 days ago 95 comments reply

I really dislike the comparison. The printing press democratized knowledge. The LLM destroys it. LLM output is perfect white noise. Enough of it will drown out all signal. And the worst part is that it’s impossible to distinguish it from real human output.

I mean think about it. Amazon had to stop publishing BOOKS because it can no longer separate the signal from the noise. The printing press was the birth of knowledge for the people and the LLM is the death.

15 days ago 7 comments reply

The printing press democratized knowledge

That's true, but it also allowed protestant "heretics" to propagate an idea that caused a permanent schism with the Catholic church, which led to centuries of wars that killed who-knows-how-many people, up to recent times with Northern Ireland.

(Or something like that, my history's fuzzy, but I think that's generally right?)

15 days ago 4 comments reply

I thought it was a king wanting a divorce who, since he couldn't get it from the Catholic church, created his own.

15 days ago 0 comments reply

Henry VIII created the Church of England in 1534 for the purposes of granting himself an annulment. Most histories count Martin Luther's 95 Theses as beginning of the Reformation in 1517 (a crisp date for a less-than-crisp event; Luther did not originally see himself as protesting the Roman Catholic Church). The Protestant Reformation was a heterogeneous movement from the beginning.

15 days ago 0 comments reply

Protestantism started in Germany with Martin Luther nailing his theses to a church door. Henry's reproductive problems came later and were only sort of related.

15 days ago 0 comments reply

Not really, no. It was Luther who kick-started Protestantism. Henry VIII attempted to supplant the Pope, and kind of slid into Protestantism by accident.

15 days ago 0 comments reply

That was the case just for the anglican church, which is only one "part" of the reformation.

14 days ago 0 comments reply

Long before that in 1054, was the East - West Schism that split the Catholic and Orthodox churches. https://en.wikipedia.org/wiki/East%E2%80%93West_Schism

15 days ago 0 comments reply

LLM output is perfect white noise.

Not even close to white noise. White noise, in the context of the token space, looks like this:

auceverts exceptionthreat."<ablytypedicensYYY DominicGT portaelight\- titular Sebast Yellowstone.currentThreadrition-zoneocalyptic

which is literally the result of "I downloaded the list of tokens and asked ChatGPT to make a python script to concatenate 20 random ones".
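For the curious, that kind of script is only a few lines. Here's a sketch using a small made-up stand-in vocabulary rather than an actual downloaded token list:

```python
import random

# Hypothetical stand-in for a downloaded token vocabulary
# (real tokenizer vocabularies have tens of thousands of entries).
vocab = ["auce", "verts", " exception", "threat", '."<', "ably", "typed",
         "icens", "YYY", " Dominic", "GT", " porta", "elight", "\\-",
         " titular", " Sebast", " Yellowstone", ".currentThread", "rition"]

random.seed(0)  # fixed seed so the example is reproducible
noise = "".join(random.choices(vocab, k=20))
print(noise)  # unreadable token soup: actual white noise over the token space
```

Sampling tokens uniformly like this is what "white noise in token space" really looks like, which is the point: LLM output is nothing of the sort.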

No, the biggest problem with LLMs is that the best of them are simultaneously better than untrained humans and yet also nowhere near as good as trained humans — someone, don't remember who, described them as "mansplaining as a service", which I like, especially as it (sometimes) reminds me to be humble when expressing an opinion outside my domain of expertise, as it knows more than I do about everything I'm not already an expert at.

Specific example: I'm currently trying to use ChatGPT-3.5 to help me understand group theory, because the brilliant.org lessons on that are insufficient; unfortunately, while it knows infinitely more than I do about the subject, it is still so bad it might as well be guessing the multiple choice answers (if I let it, which I don't because that would be missing the point of using a MOOC like brilliant.org in the first place).

15 days ago 0 comments reply

Amazon had to stop publishing BOOKS because it can no longer separate the signal from the noise.

That's because they are trying very hard not to check what they are selling, hoping that their own users and a few ML algorithms can separate the signal from the noise for them. It seems to me that the approach is no longer working, and they should start doing it by themselves.

15 days ago 8 comments reply

I really feel like you can't have used any advanced LLMs if you legitimately think the output is "perfect white noise". The results you can get from an LLM like GPT-4 are incredibly useful and are providing an enormous amount of value to lots of people. It isn't just for generating phony information to spread or having it do your work for you.

I get the most value out of asking for examples of things or asking for basic explanations or intuitions about things. And I get so much value from this that I really think the printing press is the most apt comparison.

15 days ago 6 comments reply

The problem is advanced LLMs are controlled by large corporations. Powerful local models exist (in part thanks to Meta's generosity, oddly enough) and they're close to GPT-3.5, but GPT-4 is far ahead of them, and by the time other models reach that point, whatever OpenAI, Anthropic, Meta, etc. have developed behind closed doors could be significantly better. In that case open models will be restricted to niche uses and most people will use the latest model from a giant corp.

So it is possible that LLMs will centralize the production and dissemination of knowledge, which is the opposite of what people think the printing press did. I hope I'm wrong and open models can challenge/overtake state of the art models developed by tech giants, that would be amazing.

15 days ago 5 comments reply

Precisely. I spent weeks learning about cybersecurity when GPT-4 first came out, as I could finally ask as many stupid questions as I liked, get detailed examples and use-cases for different attacks and defenses, and generally actually learn how the internet around me works.

Now it refuses, because OpenAI's morals apparently don't include spreading openly available knowledge about how to defend yourself.

Scary. I have also been using it to generate useful political critiques (given a particular theoretical tradition, some style notes, and specific articles to critique, it's actually excitingly good). What if OpenAI decides that's a threat? What reason do we have to think that a powerful institution would not take this course of action, in the cold light of history?

15 days ago 4 comments reply

how do you know what you learnt wasn't completely made up gibberish?

15 days ago 2 comments reply
15 days ago 0 comments reply
15 days ago 0 comments reply

What you say is not in conflict with AI-generated content being white noise. Even if you find some piece of AI-generated content useful, it is still white noise if it is merely combining pieces of information found in its dataset and the result is posted online or published elsewhere. There is no signal being added in that process, and it pollutes the space of content. Humans are also prone to doing this, but with the help of AI, it becomes a much larger issue.

"Signal" would mean new data, which is by definition not possible via LLMs trained on publicly available content, since that means the data is already out there, or new and meaningful ideas or innovations beyond just combining existing material. I have not seen LLMs accomplish the latter. I consider it at least possible that they are capable of such a feat, but even then the relevant question would be how often they produce such things compared to just rearranging existing content. Is the proportion high enough that unleashing floods of AI-generated content everywhere would not lower the signal-to-noise ratio from the pre-AI situation?

15 days ago 9 comments reply

the worst part is that it’s impossible to distinguish it from real human output

Doesn't that make human content look bad in the first place?

If we can't distinguish a Python book written by a human engineer or by ChatGPT, how can we demonstrate objectively that the machine-generated one is so much worse?

15 days ago 0 comments reply

That argument might work for content that serves a purely informational purpose, such as books teaching the basics of programming languages, but it doesn't work for art (e.g. works of fiction). Most of the potential for a non-superficial reading of a work relies on being able to trust that an author made a conscious effort to convey something through it, and that that something can be a non-obvious perspective on the world that differs from the reader's own.

AI-generated content has no such intent behind it, so you are effectively limited to a superficial reading. If we were to insist on assigning such intent to AI, then at most you would have one "author" per AI model, which additionally has no interesting perspectives to offer — only those deemed acceptable in the culture of whatever group of people developed the model. No perspective that could truly surprise or offend the reader with something they had not yet considered and force them to re-evaluate their world view, just a bland average of its dataset with some fine-tuning for PR and similar reasons.

15 days ago 6 comments reply

The problem is not that no one can distinguish it. It's that the intended audience (beginners in Python in your example) can't distinguish it and are not able to easily find and learn from trusted sources.

15 days ago 3 comments reply

Aren't there already bad Python books written by humans?

I bet ChatGPT can come up with above-average content to teach Python.

We should teach beginners how to prompt engineer in the context of tech learning. I bet it's going to yield better results than gate-keeping book publishing.

15 days ago 1 comments reply

There are, but it used to take actual time and effort to produce a book (good or bad), meaning that the small pool of experts in the world could help distinguish good from bad.

Now that it’s possible to produce mediocrity at scale, that process breaks down. How is a beginner supposed to know whether the tutorial they’re reading is a legitimate tutorial that uses best practices, or an AI-generated tutorial that mashes together various bits of advice from whatever’s on the internet?

15 days ago 0 comments reply
15 days ago 0 comments reply

Another great contribution would be fine-tuning open-source LLMs on less popular tech. I've seen ChatGPT struggle with htmx, for example (I presume the training dataset was small?), whereas it performs really well teaching React (huge training set, I presume).

15 days ago 1 comments reply

If beginners genuinely interested in learning Python are not capable of visiting python.org, it's very questionable how good their knowledge of the subject can ever really be.

15 days ago 0 comments reply

100% agreed.

I've seen many developers using technologies without reading the official documentation. It's insane. They make mistakes and always blame the tech. It's ludicrous...

15 days ago 0 comments reply

We can distinguish it. That's what publishers and editors do. It's also what book buyers for book chains used to do. Reviewers, writing for reputable publications, with their own editors and publishers, as well.

Humans, examining things, and putting a reputation that matters on the line to vouch for it.

The fact that Amazon doesn't want to have smart, contextually aware humans look at and evaluate everything people propose to offer up for sale on their storefront doesn't mean it can't be done. Same as how Google doesn't want to look at every piece of content uploaded to YouTube to figure out if it's suitable for kids, or includes harmful information. That's expensive, so they choose not to do it.

15 days ago 18 comments reply

The LLM does democratize knowledge, but you have to be the user of the LLM, not the target of the user of the LLM.

The LLM is the most powerful knowledge tool ever to exist. It is a librarian in your pocket and an expert in everything: it has read everything, and it can answer your specific questions on any conceivable topic.

Yes it has no concept of human value and the current generation hallucinates and/or is often wrong, but the responsibility for the output should be the user's, not the LLM's.

Do not let these tools be owned, crushed and controlled by the same people who are driving us towards WW3 and cooking the planet for cash. This is the most powerful knowledge tool ever. Democratize it.

15 days ago 11 comments reply

Asking a statistics engine for knowledge is so unfathomable to me that it makes me physically uncomfortable. Your hyperbolic and relentless praise for a stochastic parrot or a "sentence written like a choose your own adventure by an RNG" seems unbelievably misplaced.

LLMs (Current-generation and UI/UX ones at least) will tell you all sorts of incorrect "facts" just because "these words go next to each other lots" with a great amount of gusto and implied authority.

15 days ago 3 comments reply

My mind is blown that someone gets so little value out of an LLM. I get over software engineering stumbling blocks much faster by interrogating an LLM's knowledge about the subject. How do you explain that added value? Are you skeptical that I am actually moving and producing things faster?

15 days ago 2 comments reply

My mind is also blown by how much people seemingly get out of them.

Maybe they’re just orders of magnitude more useful at the beginning of a career, when it’s more important to digest and distill readily-available information than to come up with original solutions to edge cases or solve gnarly puzzles?

Maybe I also simply don’t write enough code anymore :)

15 days ago 1 comments reply
15 days ago 0 comments reply

This happened to me looking up an obscure C library. It just confidently made up a function that didn't actually exist in the library. It got me unstuck, but you can really fuck yourself if you trust it blindly.

15 days ago 4 comments reply

I agree with you but at what point does it change? Aren’t we all just stochastic parrots? How do we ourselves choose the next word in a sentence?

15 days ago 0 comments reply

In my view, one big learning from LLMs is that yes, more often than not we are just stochastic parrots. And more often than not that's enough!

But sometimes we're more than that: Some types of deep understanding aren't verbal or language-based, and I suspect that these are the ones that LLMs will have the hardest time getting good at. That's not to say that no AI will get there at all, but I think it'll need something fundamentally different from LLMs.

For what it's worth, I've personally changed my mind here: I used to think that the level of language proficiency that LLMs demonstrate easily would only be possible using an AGI. Apparently that's not the case.

15 days ago 0 comments reply

If you wish to make an apple pie from scratch, you must first invent the universe. (Carl Sagan)

We can generate thoughts that are spatially coherent, time-aware, validated for correctness, and a whole bunch of other qualities that LLMs cannot match.

Why would LLMs be the model for human thought, when they do not come close to the thoughts humans produce every minute of every day?

"Aren't we all just stochastic parrots?" is the kind of question that requires answering an awful lot about the universe before you get to an answer.

15 days ago 0 comments reply

We use languages to express ideas. Sentences are always subordinate to the ideas. It's very obvious when you try to communicate in another language you're not fluent in. You have the thought, but you can't find the words. The same thing happens when writing code, taking ideas from the business domain and translating it into code.

15 days ago 0 comments reply

God dammit please stop comparing these things to brains. Stop it. It's not even close.

15 days ago 1 comments reply

but the responsibility for the output is the user's, not the LLM's.

The current iteration of the internet (more specifically, social media) has used the same rationale for its existence, but society has proven itself too irresponsible and/or lazy to think for itself rather than be fed by the machine. What makes you think LLMs are going to do anything but make the situation worse? If anything, they're going to reinforce whatever biases were baked into the training material, which is now legally dubious.

15 days ago 1 comments reply

and can answer your specific questions on any conceivable topic

Yeah, I mean, so can I, as long as you don't care whether the answers you receive are accurate or not. The LLM is just better at pretending it knows quantum mechanics than I am.

15 days ago 0 comments reply

Even if a human expert responds about something in their domain of expertise, you have to think critically about the answer. Something that fails 1% of the time is often more dangerous than something that fails 10% of the time.

The best way to use an LLM for learning is to ask a question, assume it's getting things wrong, and use that to probe your own knowledge, which you can in turn use iteratively to probe the LLM's. Human experts don't put up with that, and they are a much more limited resource.

15 days ago 1 comments reply

For a librarian, they’re confidently asserting factual statements suspiciously often, and refer me to primary literature shockingly rarely.

15 days ago 0 comments reply

In other words they behave like a human?

15 days ago 36 comments reply

If you asked the Church back then, they would tell you that the printing press was the death of truth, because to them only the word of god was truth, and only the church could produce it.

It's all just a matter of perspective.

Yes, right now it looks like white noise, just like back then it looked like white noise which could drown out the religious texts. But we managed to get past it then and I'm sure we'll manage now.

15 days ago 32 comments reply

This is an astoundingly bad take. Surely you aren't trying to suggest that original, factual, human-authored content has no more inherent value than randomly generated nonsense?

15 days ago 0 comments reply

That's Wittgenstein's argument.

15 days ago 5 comments reply

No not at all, I'm not sure why you would even think that.

15 days ago 4 comments reply

As I read it, your parent comment suggests that the distinction in quality and utility between human-authored and AI-generated content is merely "a matter of perspective", i.e. that there is no real distinction, and that they're both equally valuable.

If you actually meant something else, you should probably clarify.

15 days ago 1 comments reply
15 days ago 0 comments reply
15 days ago 24 comments reply

The discussion here is that we're not able to distinguish them.

If we cannot distinguish, I'd argue they have similar value.

They must have. Otherwise, how can we demonstrate objectively the higher value in the human output?

15 days ago 6 comments reply

They can be distinguished. They are just becoming more difficult to distinguish. It's only slightly more difficult, but the amount of garbage is overwhelming. AI can spit out entire books in moments that would take an individual months or years to write.

There are lots of fake recipe books on Amazon, for instance. But how can you really be sure without trying the recipes? It might look like a recipe at first glance, but if it's telling you to use the right ingredients in a subtly wrong way, it's hard to tell at first glance that you won't actually end up with edible food. Some examples are easy to point at, like the case of the recipe book that lists Zelda food items as ingredients, but they aren't always that obvious.

I saw someone giving programming advice on Discord a few weeks ago — advice that was blatantly copy/pasted from ChatGPT in response to a very specific technical question. It looked like an answer at first glance, but the file type of the config file ChatGPT provided wasn't correct, and on top of that it was just making up config options in an attempt to solve the problem. I told the user this; they deleted their response and admitted it was from ChatGPT. However, the user asking the question didn't know the intricacies of which config options are available and which file types are valid configuration files. This could have wasted so much of their time, dealing with further errors about invalid config files or options that did not exist.

15 days ago 1 comments reply
15 days ago 3 comments reply
15 days ago 0 comments reply

A piece of human-written content and a piece of AI-written content may have similar value if we cannot distinguish between them. But if you can add the information that the human-written content was written by a human to the comparison, the human-written content becomes significantly more valuable, because it allows for a much deeper reading of the text, since the reader can trust that there has been an actual intent to convey some specific set of ideas through the text. This allows the reader to take a leap of faith and put in the work required to examine the author's point of view, knowing that it is based on the desires and hopes of an actual living person with a lifetime of experience behind them instead of being essentially random noise in the distribution.

15 days ago 2 comments reply

I'm not a native English speaker, but ChatGPT's answers in every interaction I've had with it sound bland. And I dislike its bite-sized format. I'm reading "Amusing Ourselves to Death" by Neil Postman, and while you may agree or disagree with his take, he developed it in a very coherent way, exploring several aspects. ChatGPT's output falls into the same uncanny valley as the robotic voice from text-to-speech software: understandable, but no human writes that way.

ChatGPT as an autocompletion tool is fine, IMO. As well as generating alternative sentences. But anything longer than a paragraph falls back to the uncanny valley.

15 days ago 1 comments reply
15 days ago 2 comments reply

If you ask an LLM something you know, you can distinguish noise from good output. If you ask an LLM something you don't know, how do you know if the output is correct? There are cases where checking is easier than producing the result, e.g. when you ask for a reference.

15 days ago 1 comments reply
15 days ago 7 comments reply

I can't distinguish between pills that contain the medicine that I was prescribed and those than contain something else entirely. Therefore taking either should be just as good.

15 days ago 6 comments reply
15 days ago 1 comments reply

If they were of similar value would there be a problem with the deluge?

15 days ago 0 comments reply
15 days ago 1 comments reply

I'd argue that giving a group with unique thoughts and ideas a voice is different than creating a noise machine.

15 days ago 0 comments reply

I think the jury is still out on whether an LLM produces ideas any more or less unique than most humans. :)

15 days ago 0 comments reply

The printing press made books cheap relative to hand copied books, but they were still expensive for most people.

Before the printing press two books cost around the same as a 2 story cottage.

Afterwards a couple books would be about a month of wages for a skilled worker.

That greatly limits one's ability to drown out anything with books.

15 days ago 1 comments reply

The printing press democratized knowledge.

Not for centuries. Due to the expense of the technology and the requirement in some locations for a royal patent to print books, the printing press just opened up knowledge a bit more from the Church and aristocracy to the bourgeoisie, but it did little for the masses until as late as the 1800s.

15 days ago 0 comments reply

A big part of this is that literacy didn’t come to the masses until the 1800s. But in England and the Netherlands you had (somewhat) free press by the late 1600s and early 1700s.

15 days ago 0 comments reply

I'm reminded of the Library of Babel

15 days ago 0 comments reply

I was told publishers don't promote a good book anymore these days. They ask: how many Instagram followers do you have?

Maybe self-publishing and BoD will decline in the long term due to ML white noise, and publishers will be a sign of quality again.

15 days ago 0 comments reply

You could argue that speech is literally noise that drowns out the signals of your environment. If you just babbled, it would be useless, but instead you use it intelligently to communicate ideas. LLM output is a new palette with which humans can compose new signals. We just have to use it intelligently.

Prompt engineering is an example of this. A clever prompt by a domain expert can prime an LLM interaction to yield better information to the recipient in a way that the recipient themselves could not have produced on their own.

15 days ago 0 comments reply

People comparing the AI bullshit spigot to the printing press are clowns.

15 days ago 1 comments reply

It used to be that a scribe would painstakingly copy a manuscript, through the process absorbing the text at a deep level. This same scribe could then apply this knowledge to his own writing, or just understand and curate existing work. The manual labor required to copy at scale employed many scribes, who formed the next generation of thinkers.

With the press, a greasy workman can churn out hundreds of copies an hour, for whichever charlatan or heretic palms him enough coin. The people are flooded with falsehoods by men whose only interest in writing is how many words they can fit on a page, and where to buy the cheapest ink.

The worst part is that it is impossible to distinguish the work of a real thinker from that of a cheap sophist, since they are all printed on the same rough paper, and serve equally well as tomorrow's kindling.

15 days ago 0 comments reply

Where are the good AI-generated books that serve as the positive side of this development?

15 days ago 0 comments reply

You're implying that what is being produced has actual value, the problem is they're acting in patently bad faith. Weep not for the spammers.

15 days ago 0 comments reply

the invention of the printing press caused a lot of controversy with the Catholic Church

> https://en.wikipedia.org/wiki/Index_Librorum_Prohibitorum

The example is from the sixteenth century, but the printing press is from the fifteenth century.

I don't think the Catholic Church bothered to take any notice at all?

15 days ago 0 comments reply

The rhyme has a lot to do with how existing power structures handle a sudden increase in the amount of written text generated. In this comparison, they both try to apply the brakes. Banning books didn't work well for the Catholic Church. I think increasing QA might actually help Amazon's book business. Of course, a book seller has a greater responsibility to society than making money.

15 days ago 2 comments reply

similar growing pains

For what it's worth, these 'growing pains' took the form of the wars of religion in Europe, which in Germany killed up to 30% of the population, that's in relative terms significantly worse than the casualties of World War I and II. So maybe the Catholic Church had a point

15 days ago 1 comments reply

> So maybe the Catholic Church had a point

Is that really the take-away? If the Catholic Church had not been so belligerent, those wars would not have been needed. Now that we are past that time, we should surely be thanking those combatants who helped disseminate knowledge in spite of the Church whose interest was in hoarding it.

15 days ago 0 comments reply

I think that's a pretty bad reading of history frankly. The Church didn't hoard knowledge, in fact they were arguably the primary preservers of knowledge and disseminator of it, through the monastic tradition in Medieval Europe. Many thousands of which were destroyed during the religious wars, which is a common theme as far sectarian wars go. They are first and foremost destroyers of knowledge.

More importantly I certainly wouldn't want to live through that period for any reason, and much less repeat it. If an ordinary printing press caused that much chaos I'm not sure I want to figure out what one on steroids is going to do

15 days ago 0 comments reply

The incentive system is completely different. The new AI-generated content is for a quick buck, just spamming out content because $1 x 10,000 is a lot.

If it was written with the aid of AI, that's different. At least someone tried to make something good and just used available tools to enhance the quality.

15 days ago 11 comments reply

How do we...

I'm not entirely sure how to word this question.

How do we make sure that most of the people we talk to are at least humans if not necessarily the person we expect them to be? And I'm not saying that like a cartoonish bad guy in a movie who hates artificial intelligence and augmented humans.

How do I not get inundated by AI that's good at trolling. How do I keep the social groups I belong to from being trolled?

These questions keep drawing me back to the concept of Web of Trust we tried to build with PGP for privacy reasons. Unless I've solicited it, I really only want to talk to entities that pass a Turing Test. I'd also like it if someone actively breaking the law online were actually affected by the deterrence of law enforcement, instead of being labeled a glitch or a bug in software that can't be arrested, or even detained.

It feels like I want to talk to people I know to be human (friends, famous people - who might actually be interns posing as their boss online), and people they know to be human, and people those people suspect to be human.

I have long term plans to set up a Wiki for a hobby of mine, and I keep getting wrapped around the axle trying to figure out how to keep signup from being oppressive and keep bots from turning me into an SEO farm.

15 days ago 5 comments reply

This is only a problem for someone terminally online. The vast majority of people talk to their friends and coworkers in person.

15 days ago 0 comments reply

That was the solution that came to mind to me too, but it doesn't work either.

Even if you're never online and only talk to people in person... over time those people will be increasingly informed by LLM-generated pseudo-knowledge. We aren't just training the AIs. They're training us back.

If you want to live in a society where the people you interact with have brains mostly free of AI-generated pollution, then I'm sorry but that world isn't going to be around much longer. We are entering the London fog era of the Information Age.

15 days ago 1 comments reply

I don't trust my friends for medical advice. Some of them trust me for plant advice, and they really probably shouldn't. I am very stove-piped.

We have two and a half generations of people right now most of whom think "I did the research" means "I did half as much reading as the average C student does for a term paper, and all of that reading was in Google."

And Alphabet fiddles while Google burns. This is going to end in chaos.

15 days ago 0 comments reply

"I did the research" means "I did half as much reading as the average C student does for a term paper

What's the alternative? No one who says that is claiming they did original research; they're saying they searched around and got what they believe to be at least a consensus among the body of experts they trust.

Like, I agree the problem sucks, but I have no idea what a solution looks like. For fields someone is totally unfamiliar with, they simultaneously lack the knowledge to evaluate the truth of a claim and the knowledge to evaluate whether someone is qualified and trustworthy enough to believe. It's turtles all the way down — especially because, for topics of any interest, you can find as many experts as you care to, of whatever qualification you demand, making all sorts of contradictory claims.

15 days ago 0 comments reply

This is only a problem for someone terminally online.

Is it? Even people whose social life is entirely IRL still have to increasingly interact with various businesses, banks, healthcare providers, the government, and often more distant colleagues through online services. Do I want these interactions to go through LLM chatbots? No. Can I ensure that I'm speaking to an actual human if the communication is text-based? Not really.

15 days ago 0 comments reply

This is a problem for anyone who is not actively vigilant about the information they consume. A family member (who I would not describe as "terminally online") came to me today in a panic talking about how some major event had just occurred and how social order was beginning to collapse. I quickly glanced at the headlines on a few major news outlets and realized that they just saw some incendiary content designed to elicit that reaction. I calmed them down and walked them through a process they could use to evaluate information like that in the future, and they were a little embarrassed.

The concern isn't necessarily for you. It's for the large swaths of people who are less equipped to filter through noise like this.

15 days ago 1 comments reply

Meet people in real life. This problem is trivially solved by just using meatspace.

Alternatively for sign ups, tell them to contact you and ask. Chat with them a moment. Ask them about their hobbies and family.

15 days ago 0 comments reply

Using meatspace doesn't solve the problem; using meatspace exclusively does. And that's not a great solution given, you know, how much of the world "happens" online now.

15 days ago 1 comments reply

There is some irony in Sam Altman bringing us the cause (AI) and purported solution (Worldcoin) for your problem at the same time.

15 days ago 0 comments reply

It's what ad men do. Point out there's a problem, offer you the solution.

15 days ago 0 comments reply

We don't. See the Boltzmann brain: https://en.m.wikipedia.org/wiki/Boltzmann_brain

15 days ago 4 comments reply

See also: "Tom Lesley has published 40 books in 2023, all with 100% positive reviews"

https://dstill.ai/hackernews/item/35687868

15 days ago 3 comments reply

I remember that one. Interestingly, the Amazon link it goes to shows only 3 books now, all of which look real, not the 40 that I remember seeing before.

So I guess Amazon is doing something, even though I regularly hear complaints from authors that it allows blatant piracy all the time.

15 days ago 0 comments reply

shows only 3 books now

Those appear to be by different authors with similar names: https://www.amazon.com/s?k=%22tom+lesley%22

15 days ago 0 comments reply

Amazon has no reason to give a shit about piracy on KDP: they make money either way. But having a load of AI generated garbage on your platform makes it far less valuable. You want your stolen books to actually be good. :P

15 days ago 0 comments reply

Possibly it's the author removing them at the first one star rating to keep their author score high?

15 days ago 1 comments reply

It seems Amazon cares more about polluted search results in Kindle than about polluted search results in its main e-commerce business. I think low-effort books generated by AI are much less detrimental than sketchy physical products being shipped to your door in 2 days or less.

15 days ago 0 comments reply

It's probably about volume rather than quality. Sketchy copycat product lines are still hard limited by the number of factories and shipping operations in existence, while sketchy AI-generated books can easily keep growing exponentially in number for a while.

15 days ago 0 comments reply

The title of this story doesn’t seem to match the content. This seems like a proactive move to prevent individual publishers from spamming many many submissions - and even then, they’re willing to make exceptions.

While we have not seen a spike in our publishing numbers, in order to help protect against abuse, we are lowering the volume limits we have in place on new title creations. Very few publishers will be impacted by this change and those who are will be notified and have the option to seek an exception.

15 days ago 2 comments reply

Livestreams where artists show their creative process and use the streaming platform to immediately sell the thing they produced, just to prove it had human origins.

This is the future

15 days ago 0 comments reply

We have realtime filters, avatars, translators, TTS, etc. Soon, all of this will be "good enough" to mimic the proposed solution.

15 days ago 0 comments reply

You're only kicking the can down the road.

15 days ago 3 comments reply

> We require you to inform us of AI-generated content (text, images, or translations) when you publish a new book or make edits to and republish an existing book through KDP. AI-generated images include cover and interior images and artwork. You are not required to disclose AI-assisted content.

15 days ago 1 comments reply

Their distinction:

AI-generated: We define AI-generated content as text, images, or translations created by an AI-based tool. If you used an AI-based tool to create the actual content (whether text, images, or translations), it is considered "AI-generated," even if you applied substantial edits afterwards.

AI-assisted: If you created the content yourself, and used AI-based tools to edit, refine, error-check, or otherwise improve that content (whether text or images), then it is considered "AI-assisted" and not "AI-generated." Similarly, if you used an AI-based tool to brainstorm and generate ideas, but ultimately created the text or images yourself, this is also considered "AI-assisted" and not "AI-generated." It is not necessary to inform us of the use of such tools or processes.

https://kdp.amazon.com/en_US/help/topic/G200672390#aicontent....

15 days ago 0 comments reply

Allowing the use of tools to modify the contents erases any clear distinction between the categories.

15 days ago 0 comments reply

This is really interesting. I imagine AI-generated art and illustrations for mostly-text books are pretty compelling for authors, for all the same reasons that AI-generated text is of value to non-authors. I wonder how this line will work out in practice.

15 days ago 1 comments reply

This doesn’t seem surprising. Half of my YouTube ads these days are for some kind of AI+Kindle-based get rich quick scheme.

15 days ago 0 comments reply

About time. YouTube is full of videos about making eBooks with ChatGPT, e.g. "Free Course: How I Made $200,000 With ChatGPT eBook Automation at 20 Years Old" https://www.youtube.com/watch?v=Annsf5QgFF8

15 days ago 1 comments reply

Strategically, AI generated content is a boon for platforms like Amazon.

1. The more content there is, the less reliably you can find good stuff without reviews, and the more centralized distribution platforms with reviews and rankings are needed.

2. Even if people are making fake books for money laundering, Amazon gets a cut of all sales, laundered or not.

Just like with Yahoo's directory once upon a time, though, and movie theaters, the party gets ruined when most people learn they can use AI to generate custom stories at home and/or converse with the characters and interact in far more ways than currently possible. Content is going from king to commodity.

15 days ago 0 comments reply

Amazon's reviews and ratings are complete garbage, and have been for some time.

15 days ago 0 comments reply

This sounds like a commendable move by Amazon. I especially like the idea of requiring disclosure of use of "AI".

15 days ago 0 comments reply

Here's a pretty good article about the problem with AI generated books. "AI Is Coming For Your Children" [1]

[1] https://shatterzone.substack.com/p/ai-is-coming-for-your-chi...

15 days ago 10 comments reply

Why people read contemporary books is something I can't really get my head around. There are enough classics to keep people busy for life - and they're 100% guaranteed to be insightful and pleasurable.

15 days ago 0 comments reply

Should people stop telling new stories? A century from now the best books of today will be classics. Books can act as a time capsule of a certain time and place and mode of life. And that has value.

15 days ago 0 comments reply

Contemporary books are just new classics. It's like asking why read at all :)

15 days ago 1 comments reply

There’s a distinct demographic in the contemporary-fiction-reading community, as can be seen in corners of Goodreads or Instagram, that demands new fiction to tell the stories of groups not covered, or supposedly unfairly covered, in that classic literature: LGBT, BIPOC, the working class, etc. In fact, they might even deny that the classics are “insightful and pleasurable” due to these social concerns.

15 days ago 0 comments reply

That's really weird. People are making all kinds of books and stories. And stories are relevant to their time. The Matrix wouldn't have been written in 1900, A Tale of Two Cities wouldn't have been written in 1200, …

It is true, though, that if you have a culturally diverse set of friends and are open to their experiences and opinions, a lot of "the classics" start to smell bad. Imagine being black and reading The Grapes of Wrath. You might find the situation of the main characters humorous or infantile, considering how relatively fortunate they are.

15 days ago 2 comments reply

What's the name of the law where the longer something has already been around, the longer it will likely stay around in the future?

I've found that it definitely applies to books. Starting at a ~20 year horizon is a surprisingly good filter for quality.

15 days ago 1 comments reply

> What's the name of the law where the longer something has already been around, the longer it will likely stay around in the future?

The Lindy effect.

15 days ago 0 comments reply

Thank you.

15 days ago 0 comments reply

Yes, and there's been a drop in quality since then too. The 1800s through the 1940s really treated literature as the high-water mark of quality media, and it shows.

Finding deeply valuable and high quality books is much rarer in today's crop of authors. The best minds are rarely making the medium of literature their highest good, but are instead chasing dollars and relations with the rich and famous.

15 days ago 0 comments reply

I think the risk of reading a suboptimal book is not greater than the risk of not allowing myself to be exposed to different voices.

15 days ago 0 comments reply

One of the best books I read last year was the story of the rescue of the football team that was trapped in a flooded cave in 2018 – written by cave diver Rick Stanton, who found the team and led the rescue. How would that account have been written into a book before it happened?

15 days ago 0 comments reply

This is just the tip of the iceberg compared to what we're heading into with the web. Very concerning.

I would go long the value of genuine human writing, aka the 'small web'.

15 days ago 7 comments reply

Gee, I sure hope people don't just lie about it...

15 days ago 6 comments reply

It doesn’t matter. It’s garbage content and immediately recognizable as being AI generated.

It is absolutely possible to write a good article or even a good book with AI, but at least for now it’s just as hard, if not harder, than doing it without AI.

But of course people trying to make a quick buck won’t put in the required effort, and they likely don’t even have the ability to create great or even good content.

15 days ago 2 comments reply

> It's garbage content and immediately recognizable as being AI generated.

It's also recognizable by its sheer volume. An "author" who submits several new books every day is clearly not doing their own writing. The AI publishing scam relies on volume -- they can't possibly win on quality, but they're hoping to make up for that by putting so many garbage books on the market that buyers can't find anything else.

15 days ago 1 comments reply

I'm not sure. Ghostwriting exists, and a person (or organization) with enough money could easily pay enough ghostwriters to output at a more than human pace.

15 days ago 0 comments reply

Even at their most prolific, a ghostwritten author still probably wouldn't publish more than one or two books a month. Beyond that point, you're just competing with yourself. (For instance, young adult series like Goosebumps, The Baby-Sitters Club, or Animorphs typically published a book every month or two.)

Publishing multiple books per day is out of the question. That's beyond even what's reasonable for an editor to skim through and rubber-stamp.

15 days ago 0 comments reply

> It doesn't matter. It's garbage content and immediately recognizable as being AI generated.

Is it? How do you immediately recognize a book as AI generated before buying it, if the author isn't doing something silly like releasing several books per day/month? And even after you buy a book, how can you distinguish between the book just being terrible and the book being written with extensive use of AI? I don't believe AI can write good books, but I would still like to distinguish those two cases, since the former is just a terrible book, which is perfectly fine, while the latter I would like to avoid. I don't want to waste my limited time reading AI content.

15 days ago 0 comments reply

> It is absolutely possible to write a good article or even a good book with AI, but at least for now it's just as hard, if not harder, than doing it without AI.

How hard is it though, to create a shitty book with AI, that Amazon can't detect was written with AI?

15 days ago 0 comments reply

> It's garbage content and immediately recognizable as being AI generated.

Yea, but the Turing Test is actively being assaulted. Soon we won't know the difference between an uninspired book written by an AI and an uninspired book written by a human.

15 days ago 1 comments reply

If KDP required an ISBN, it would cut down on the garbage books. In the US, at least, ISBNs cost money.

15 days ago 0 comments reply

They're not that much, but you can just get an Australian one for free.

15 days ago 0 comments reply

So, what are the actual limits?

15 days ago 0 comments reply

Finally. I hope the number of garbage books will decrease at least slightly from here.

15 days ago 0 comments reply

AI-generated mushroom-picking guides: what could possibly go wrong?

15 days ago 3 comments reply

How do we even know this entire comment thread isn't polluted with AI?

Maybe it doesn't matter. The quality of the work matters more than the process of actualization.

15 days ago 1 comments reply

In a practical sense: AI-generated stuff is crappy, often subtly wrong, and can be produced faster than human-written content. So it becomes untenable to even search for good information.

15 days ago 0 comments reply

Then it's good for fiction. Lots of demand for fiction.

15 days ago 0 comments reply

It seems that, as a society, we are coming to realize that enabling anyone to do anything on their own, at any time, isn't the best of ideas.

Verifiability and authenticity matter and are valuable. Amazon has long had a problem with fake reviews, and this issue with Kindle books seems an extension of that. Massive centralized platforms like Amazon make fraud more likely and are bad for the consumer.

The "decentralization" that we need as a society is not in the form of any crypto based technical capability but simply for the size of the massive players to be reduced so competition can reemerge and give consumers more options on where and how to spend their dollars. Other E-book stores may just pop up that develop relationships with publishers and disallow independent publishing if amazon were forced to be broken up.

I hope the FTC can find a strategy to force some of these massive corporations to split, making more competition likely.
