The best part about this is that you know the type of people/companies using langchain are likely the type that are not going to patch this in a timely manner.
I'll admit that I haven't looked at it in a while, but as originally released, it was a textbook example of how to complicate a fundamentally simple and well-understood task (text templates, basically) with lots of useless abstractions that made it all sound more "enterprise". People would write complicated langchains, but when you looked under the hood all it was doing was some string concatenation, and the result was actually less readable than a simple template with substitutions in it.
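For contrast, the "simple template with substitutions" that comment mentions really is only a few lines of plain Python (the template text and function name here are made up for illustration):

```python
from string import Template

# A plain text template covers most "prompt chain" use-cases:
# substitute values, then make one API call. No framework needed.
PROMPT = Template(
    "Summarize the following $doc_type in $n bullet points:\n\n$text"
)

def build_prompt(doc_type: str, n: int, text: str) -> str:
    # The substitution step is the entire "chain".
    return PROMPT.substitute(doc_type=doc_type, n=n, text=text)
```

Anything beyond this (retries, parsing) is a short function, not a framework.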
I'm not sure what the stereotype is, but I tried using langchain and realised most of its functionality actually requires more code than simply writing my own direct LLM API calls.
Overall I felt like it solves a problem that doesn't exist, and I've been happily sending direct API calls to LLMs for years without issues.
When my company organized an LLM hackathon last year, they pushed for LangChain, but instead of building on top of it I ended up creating a more lightweight abstraction for our use-cases.
JSON Structured Output from OpenAI was released a year after the first LangChain release.
I think structured output with schema validation mostly replaces the need for complex prompt frameworks. I do look at the LC source from time to time because they do have good prompts baked into the framework.
IME you could get reliable JSON or other easily-parsable output formats out of OpenAI's models going back at least to GPT-3.5 or GPT-4 in early 2023. I think that was a bit after LangChain's release, but I don't recall hitting problems that required an extra layer in order to do "agent"-y things ("dispatch this to this specialized other prompt-plus-chatgpt-api-call, get back structured data, dispatch it to a different specialized prompt-plus-chatgpt-api-call") before it was a buzzword.
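The "get back structured data" step described above is small enough to sketch with the standard library alone. The field names below are assumptions for illustration, not anything from the OpenAI API or LangChain:

```python
import json

# Illustrative schema for a model reply; adjust per use-case.
REQUIRED = {"title": str, "severity": str, "affected": list}

def parse_reply(raw: str) -> dict:
    """Parse a model reply as JSON and check it against a tiny schema.

    On failure, raise so the caller can re-prompt; that retry loop is
    most of what the "agent"-y dispatch described above amounts to.
    """
    data = json.loads(raw)
    for key, typ in REQUIRED.items():
        if not isinstance(data.get(key), typ):
            raise ValueError(f"bad or missing field: {key}")
    return data
```

With native structured-output modes, even the re-prompt loop is rarely needed.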
No dig at you, but I take the average langchain user to be someone who either a) is using it because their C-suite heard about it at some AI conference and foisted it upon them, or b) does not care about software quality in general.
I've talked to many people who regret building on top of it but they're in too deep.
I think you may come to the same conclusions over time.
It's definitely LLM generated. I came here to post that, then saw you had already pointed it out. The giveaway for me: 'The most common real-world path here is not “attacker sends you a serialized blob and you call load().” It’s subtler:'
The "It's not X, it's Y" construction; bolded items in a list.
Also, no programmer would use this curly apostrophe instead of a straight single quote.
CVE-2025-68664 (langchain-core): object confusion during (de)serialization can leak secrets (and in some cases escalate further). Details and mitigations in the post.
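A generic sketch of why "object confusion during (de)serialization" is dangerous, assuming a loader that resolves class paths from the serialized data itself. This is illustrative only, not the actual langchain-core implementation:

```python
import importlib
import json

def naive_load(blob: str):
    """Hypothetical loader in the general style the CVE describes --
    NOT the actual langchain-core code. It rebuilds an object from a
    serialized dict by importing whatever class the data itself names.
    If attacker-influenced data ever reaches a loader like this, the
    attacker picks the class and the constructor kwargs ("object
    confusion"), which can coerce secret-bearing objects into being
    instantiated or echoed back.
    """
    data = json.loads(blob)
    module = importlib.import_module(data["module"])
    cls = getattr(module, data["class"])
    return cls(**data.get("kwargs", {}))
```

The safer pattern is an explicit allow-list of loadable classes, rather than trusting a class path embedded in the data.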
LLM slop. At least one clear error (hallucination): "’Twas the night before Christmas, and I was doing the least festive kind of work: staring at serialization"
Per the disclosure timeline, the report was made on December 4; it was definitely not the night before Christmas when you were doing the work.
Cheers to all the teams on sev1 calls on their holidays, we can only hope their adversaries are also trying to spend time with family. LangGrinch, indeed! (I get it, timely disclosure is responsible disclosure)
If I want to cleanup, summarize, translate, make more formal, make more funny, whatever, some incoming text by sending it through an LLM, I can do it myself.
I would rather read succinct English written by a non-native speaker filled with broken grammar than overly verbose but well-spelled AI slop. Heck, just share the prompt itself!
If you can't be bothered to have a human write literally a handful of lines of text, what else can't you be bothered to do? Why should I trust that your CVE even exists at all - let alone is indeed "critical" and worth ruining Christmas over?
Unfortunately, the sheer amount of ChatGPT-processed texts being linked has for me become a reason not to want to read them, which is quite depressing.
> I prefer reading the LLM output for accessibility reasons.
And that's completely fine! If you prefer to read CVEs that way, nobody is going to stop you from piping all CVE descriptions you're interested in through a LLM.
However, having it processed by a LLM is essentially a one-way operation. If some people prefer the original and some others prefer the LLM output, the obvious move is to share the original with the world and have LLM-preferring readers do the processing on their end. That way everyone is happy with the format they get to read. Sounds like a win-win, no?
However, there will be cases where lacking the LLM output, there isn't any output at all.
Creating a stigma around a technology which is easily observed as being, in some form, accessible is to be expected in the world we live in. As it is on HN.
Not to say you are being anything of the sort; I just don't believe anyone has given it all that much thought. I read the complaints and can't distinguish them from someone complaining that they need to make some space for a blind person using their accessibility tools.
> However, there will be cases where lacking the LLM output, there isn't any output at all.
Why would there be? You're using something to prompt the LLM, aren't you - what's stopping you from sharing the input?
The same logic applies, to an even larger extent, to foreign-language content. I'd 1000x rather have a "My english not good, this describe big LangChain bug, click <link> if want Google Translate" followed by a decent article written in someone's native Chinese, than a poorly-done machine translation output. At least that way I have the option of putting the source text into different translation engines, or perhaps asking a bilingual friend to clarify certain sections. If all you have is the English machine translation output, then you're stuck with it. Something was mistranslated? Good luck reverse engineering the wrong translation back to its original Chinese and then into its proper English equivalent! Anyone who has had the joy of dealing with "English" datasheets for Chinese-made chips knows how well this works in practice.
You are definitely bringing up a good point concerning accessibility - but I fear using LLMs for this provides fake accessibility. Just because it results in well-formed sentences doesn't mean you are actually getting something comprehensible out of it! LLMs simply aren't good enough yet to rely on them not losing critical information and not introducing additional nonsense. Until they have reached that point, their user should always verify its output for accuracy - which on the author side means they were - by definition - also able to write it on their own, modulo some irrelevant formatting fluff. If you still want to use it for accessibility, do so on the reader side and make it fully optional: that way the reader is knowingly and willingly accepting its flaws.
The stigma on LLM-generated content exists for a reason: people are getting tired of starting to invest time into reading some article, only for it to become clear halfway through that it is completely meaningless drivel. If >99% of LLM-generated content I come across is an utter waste of my time, why should I give this one the benefit of the doubt? Content written in horribly-broken English at least shows that there is an actual human writer investing time and effort into trying to communicate, instead of it being yet another instance of fully-automated LLM-generated slop trying to DDoS our eyeballs.
You wouldn't complain as much if it were merely poorly written by a human. It gets the information across. The novelty of complaining about a new style of bad writing is being overdone by a lot of people, particularly on HN.
I wonder if this code was written by an LLM hahaha
> Overall I felt like it solves a problem that doesn't exist, and I've been happily sending direct API calls to LLMs for years without issues.
That was more fun than actually using it.
Ugh. I’m a native English speaker and this sounds wrong, massaged by LLM or not.
“Large blast radius” would be a good substitute.
I am happy this whole issue doesn’t affect me, so I can stop reading when I don’t like the writing.
I would rather just read the original prompt that went in instead of verbosified "it's not X, it's **Y**!" slop.
Not everyone speaks English natively.
Not everyone has taste when it comes to written English.
> If you can't be bothered to have a human write literally a handful of lines of text, what else can't you be bothered to do? Why should I trust that your CVE even exists at all - let alone is indeed "critical" and worth ruining Christmas over?
More importantly though, the sheer volume of this kind of complaint on HN has become a great reason not to show up.