This is emblematic of the LLM race in general. We’re actively pressured to use Copilot at work, and it’s crammed into every Microsoft product. I’m thankful that my iPhone is old enough not to use LLMs. Companies are afraid of being left behind in the new arms race, but that doesn’t mean the technology actually presents use-cases which most people need. (Worse are the meeting summaries or emails which are written by LLMs. The summaries are just not very good, and any sort of LLM writing is a tacit acknowledgement that people don’t really care what they are writing, and that no one is really reading that writing very carefully.)
What I don't understand is, what benefit is there for 99% of companies to get in on the ground floor of LLMs? If you're not developing your own model, you're effectively just beta testing someone else's model. And if the sales pitch of LLMs being able to do basically anything comes true, wouldn't most companies still get the same benefit if they just wait? It seems like a lot of companies are so terrified of missing the boat that they don't sit down and do actual risk analysis.
I’ll tell you why this happens. You might use ChatGPT for a bit and your initial impressions will be great. It does what I ask of it! You might be aware that it makes mistakes sometimes, but when you use it, you don’t notice them because you’re using it interactively.
Now if LLMs are just as effective as your experience says, they are indeed extremely useful and you absolutely should see if they can help you.
It’s only when you attempt to build a product — and it could be one person writing one Python script — that uses LLMs in an automated way with minimal human input that you really get insights into LLMs’ strengths and their limitations. You realize it could be useful, but you have to sometimes baby it a lot.
How many people get to step two? That’s a select few. Most people are stuck in the dreamy phase of trying out interactive LLMs.
This is a recurring issue with all new technology. Heck, it happens with new software frameworks.
The other problem I find is that LLMs are changing so fast that what you evaluated 6-12 months ago might be completely different now with newer models.
So the strengths and weaknesses quickly can become outdated as the strengths grow and weaknesses diminish.
The first batch of LLMs people tried in 2023 had a lot of weaknesses. At the end of 2024, we can see increases in speed and in the complexity of output. People are creating frameworks on top of the LLMs that further increase their value. We went from thousands of tokens in context to millions of tokens pretty fast.
I can see myself dividing problems up into 4 groups:
1. LLMs currently solve the problem.
2. They don't solve it now, but we are within a couple of iterations of next-generation models or frameworks being able to solve it.
3. LLMs are still years off from being able to solve this effectively, so wait and implement it when they can.
4. LLMs will never solve this.

I think a lot of people building products are in group 2 right now.
Realism eventually sets in and they move to 3 and 4.
This definitely resonates, but I'm left wondering why there hasn't been a collective "sobering up" on this front. Not on a personal/team/company level, but just in terms of the general push to cram AI into everything. For how much longer will new AI features assault us in software where they ostensibly won't be that useful?
It seems that the effort required to make an LLM work robustly within a single context (spreadsheet, word doc, email, whatever) is so gargantuan (honestly) that the returns or even the initial manpower wouldn't be there. So any new AI feature feels more or less like bloat, and if not fully useless, then at least a bit anxiety-inducing in that you have no clue how much you can rely on it.
I can tell you that there has been a lot of sobering up — but that the news isn’t made by those people…
Look up any subject on the web and you'll find a glut of blogs saying you should do X, Y, or Z.
You can ask your own circle but I'm afraid it's not much different.
So it's your job to figure out who actually is worth listening to... all the while you yourself have no experience with it. You have all these camps of people who have varying opinions on a subject and you end up in one of them. The HN camp on AI is a lot more sobering than an ecstatic Reddit subreddit about AI.
Who actually has the real answers? The small slice of people who have tried building something. Everyone else is just listening to their circle.
I said the same thing to a previous company before I was let go. Confused why they were butchering their business strategy in favor of a gold rush.
The main benefit of LLMs was already abundantly clear: literally just chat with it in day-to-day work when you can. Ask it questions about accounting, other domains it knows, etc. That's like up to a 10-20% performance increase on tasks if you align it OK.
Still, they were in search of a unicorn, and it was really tiring to be asked regularly how AI could help my workflows. They were not even spending a real budget on discovering "groundbreaking" use cases, meanwhile hounding us to shove a RAG-bot into every product they owned.
The only thing that made sense was that it was a marketing strategy to promote visibility, but they would not acknowledge that or tell us that directly (but still--it was not their business strategy to get NEW customers).
> The main benefit of LLMs was already abundantly clear
In my industry the main benefit (so far) is taking all of our human-legible unstructured data and translating it into computer-legible structured data. Loving it.
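For anyone curious what that looks like in practice, the core pattern is small. A minimal sketch, assuming the OpenAI Python client; the invoice schema, field names, and model choice here are made up for illustration, not what any particular shop runs:

    # Unstructured text -> structured JSON, a minimal sketch.
    # Assumes OPENAI_API_KEY is set; schema and prompt are hypothetical.
    import json
    from openai import OpenAI

    client = OpenAI()

    def extract_invoice(text: str) -> dict:
        # Force the model to emit valid JSON matching a fixed set of keys.
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            response_format={"type": "json_object"},
            messages=[
                {"role": "system",
                 "content": "Extract invoice fields as JSON with keys: "
                            "vendor, date, total, currency. Use null if absent."},
                {"role": "user", "content": text},
            ],
        )
        return json.loads(response.choices[0].message.content)

    print(extract_invoice("ACME Corp billed us $1,200.50 on 2024-03-01."))

The real work is everything around this: validating the output against a schema and sampling results for human review, which is also one reasonable answer to the quality-control question below.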
Are you able to talk more about that? I’m curious what costs are when you run this at scale. We paid a firm $60k to write a custom parser. We parse around 50,000 pages/month. The parser is 100% accurate and has near $0 continuing costs.
How do you do quality control?
> Ask it questions about accounting, other domains it knows
Be very careful here if you're using it for anything important! LLMs are quite good at answering questions about accounting in ways which are superficially convincing-looking, yet also complete nonsense. "But the magic robot told me it was okay" will not fly in a tax audit, say.
Exactly my immediate reaction. Accounting has to follow very strict rules and needs some application of judgement.
It might answer questions in a useful way, but you have to make sure you understand the answers and that they match accounting standards or tax rules (and one danger, at least in some places, is that they are different and you might apply the wrong one).
I couldn’t be arsed typing a reference number into my online banking for a bill payment the other day, and it was a copy-protected PDF, so I fired a screenshot into Claude and GPT and asked them to extract the details I needed, and both of them repeatedly got the OCR wrong.
I don’t trust these at all for anything apart from code which I can at least read/rewrite.
It’s quite nice for unit tests, I guess. And weird k8s manifests you only write now and again, like batch/v1 CronJob or whatever.
I’m not panicking about my job just yet..
Talking about what “companies” are doing is misleading because it suggests that the actions taken will be driven by the medium to long term interests of the company.
In reality, the decisions are made by individual executives and managers with their own interests in mind who are being asked by the people they report to what they’re doing in AI. And this goes all the way to the top where the CEOs are basically required to tell shareholders how they’ve implemented AI and how it’s helping them a bunch.
One of the nice things about being in AI right now is that your customers will advertise it and lie about how useful it has been.
Gotta check the “Does your product have AI” box on the Gartner Magic Quadrant survey.
Stock prices go up for shareholders when the C-suite declares they are integrating "AI". This is a well-made and short video about the strategy, but the short of it is: "AI integration is not for the sake of employees, but the investors' stock price". https://www.youtube.com/watch?v=6Lxk9NMeWHg
> What I don't understand is, what benefit is there for 99% of companies to get in on the ground floor of LLMs?
"Mr. President - we must not allow an LLM gap!"
We're still in the buzzword stage where "AI" increases sales more than it decreases them, regardless of what your product is, whether it actually uses AI, or what it uses it for. So if your business sells pickles, better add "now with AI!" because more people will buy your pickles if they have AI than will stop buying it.
(This comment is mainly facetiousness borne out of frustration, but the point stands)
Pure anecdote but in the meetings I have been in eyes glaze over when the SaaS salesman mentions AI in a generic way without providing any examples of how we can use it. The only way it gets any traction is when the technical wingman can answer questions about how it will account for required business logic and permissions.
> what benefit is there for 99% of companies to get in on the ground floor of LLMs?
Most LLM use at the corporate level is happening through Office 365 where Microsoft has put a Copilot button on everything including Word, Outlook and PowerPoint. Execs didn't necessarily ask for it, it protrudes very conspicuously in the UI now.
The thinking is that the trajectory of LLMs will get them an AI flywheel where they can pump money in and get unlimited amounts of intelligence, either augmenting or replacing human labor for pennies on the dollar. The business 'thought leaders' on this view it as a largely zero-sum game: get there first or watch your business die as someone else beats you to it.
This has a very late-90's vibe and is quite entertaining to watch.
LLMs are a tool though, so it benefits getting in early and gaining experience using the tool. Companies need to either train up people using a tool, or buy that expertise with the tool. Not to mention all the LLM adjacent tools as well. It's a big, messy, wide field of new software things at this point.
Sure, but does it make sense that your cubicle row of data entry guys are on the cutting edge of LLM expertise? Because the way LLMs are being pushed into every spreadsheet software would imply that's something someone is asking for.
Cutting edge of LLM expertise? Just using someone's shitty LLM product doesn't require "cutting edge LLM expertise".
First-mover advantage. It doesn't matter if the product works or not, if it doesn't sell because everyone bought your competitor. The product will mature with time, what won't get better with time is your market position if you're seen as being behind the times.
I think you'll find most people in leadership positions at most companies are not that forward thinking, proactive, or frankly intelligent. I thought cost-benefit and risk was analyzed on most big company decisions, until I sat in rooms for a Fortune 500 where those decisions were getting made. If you assume that everyone everywhere is doing just barely the minimum to not get fired, you're right more often than not.
Career risk is also a very real motivation. If you are an executive at a company whose competitors are jumping on the AI bandwagon, but you are not, you will have to justify that decision towards your superiors or the board and investors. They might decide that you are making a huge strategic blunder and need to be replaced. Being proven right years later doesn't do much for you when you no longer have a job. And if you were wrong, then things look even worse for you. On the other hand, if you do get on the bandwagon yourself, and things go sideways, you can always point to the fact that everyone else was making the same mistake.
I think people are laying the groundwork for "true" AI to be plugged in, even if they know it's not currently that effective.
I think companies use them because employees want to use them, and because there are tons of blog posts (AI-written?) saying they increase productivity and you'll be left behind if you don't adopt them.
For startups that basically implement a frontend to chatgpt or similar… well they have no chance of ever being profitable but investors might not know that.
Our competitors are getting in on the ground floor of LLMs.
Do you want to be trying to raise money against them with nothing to tell potential investors about your company's usage of AI?
IMWO (w: wrong), it's because just riding along FOMO waves is a legit strat. It just doesn't matter if LLMs, or marijuana, serve no purpose, or whether they're detrimental or suggested to cause schizophrenia. It's more important for leviathan-class corporates to not be different where it doesn't pay to be different.
I assume they believe in the advantages, to the company, of using these AI tools. Of adding them to the workflow, training their employees, and understanding quickly how it impacts their space.
If you just wait and all of your competitors have a workforce enabled and equipped with the best tools, you are at a disadvantage. It's being the company that put off computerization or automation when everyone else went wild.
And FWIW, I cannot imagine a programmer, in 2024, who remains dismissive of LLMs or the related tooling. While they are grossly oversold by the LinkedIn crowd, and we are still a long way from so-called "prompt engineers" replacing devs, they're a massive accelerator and "second set of eyes", especially if you're doing varied, novel work.
> especially if you're doing varied, novel work.
What do you mean by that? In my experience they mostly help when you are doing mainstream, well-trodden work for which a company’s/project’s/domain’s inside knowledge isn’t needed. Maybe you mean the latter by “novel”?
>What do you mean by that?
Varied and novel to the programmer. If you're doing the same thing day in and day out it probably isn't much use. If you're like many programmers and you're jumping between libraries and languages and platforms and APIs and domains and spheres, it's immensely helpful.
>doing mainstream, well-trodden work for which a company’s/project’s/domain’s inside knowledge isn’t needed
A core disconnect in this discussion is that many people seem to be arguing from the position of "I tried to generate whole solutions with these tools and it failed, so it's useless".
Everything is well-trodden. One of the commentators in here, who declares themselves a unique snowflake where these tools are useless, does "sockets with encryption and messaging", which is some of the most well-trodden ground in this domain. Everything is glue. Everything is a lot of relatively simple things strung together. And for all of those, portions are helped and accelerated with these tools.
It's alright that you like LLMs but you don't need to name-call ("luddites", "snowflakes") those who don't find them as useful as you do.
Neither case is name-calling, and I don't think pearl-clutching for effect is useful here.
Everyone is working in unique domains and spaces with weird rules and restrictions and needs. The whole point of the special snowflake comment is specifically that people aren't remotely unique in being a special snowflake.
And the Luddism argument isn't "name-calling", it's an observation of an absolute truth on here.
Everything isn’t well-trodden. I’m mostly doing non-glue, heavily domain knowledge-based work. The main issue I’m running into is that explaining the domain and context to the LLM will take significantly more time than just doing the work myself, and also that parts of the necessary knowledge and most of the source code are NDA-protected, so only a local LLM would do.
Sigh.
Everything is extremely well-trodden at the code level. You're taking some inputs and generating some outputs. You're calling some functions. You're manipulating some strings or lists or sets. You're sorting some things or filtering things. You're building a client to a given API. You're doing some messaging and some encryption.
I guarantee that your code isn't remotely as unique or novel as you think it is. Of course on here everyone is 240lbs, 6'2" and benches 350, and their magically novel, super unique code is just too secret. So everyone's special.
If you had to explain the domain and context to these tools, you might be using them wrong.
"It's all just code, how hard can it be, and your NDAs don't apply" is certainly a take, but some people do actually solve problems whose solutions aren't already on the Internet and are forbidden from exfiltrating code! The fact that you don't solve such problems and are not bound by such NDAs is not a huge piece of evidence, to be honest.
Can you imagine even in principle a piece of evidence that would convince you otherwise?
Who said that NDAs don't apply? If you need to strawman to make an argument, just save the bits and skip doing it.
And yes, in the end all code distills down to shit that is very similar to the countless billions of lines of code on the internet. Everything -- every single project that anyone on this site is working on -- is a bunch of glued-together shit where 90% (more like 99%, probably 99.9%) of it is in common with code seen in countless other projects. Utterly regardless of domain or business or project-specific uniqueness. Someone would have to be profoundly incompetent to not realize this.
Again, I seem to be arguing with people who think that using a tool means feeding it their whole projects and having it rewrite them in Rust or something. In reality a programmer could yield immense value from such tools having a) given them zero lines of their code, b) used zero lines of the code they generated. This isn't a difficult concept, but I'm going to continue getting insane responses from all of the unique people who work on the amazingly unique situations where instead of sorting from A-Z, they sort from Z-A!
You certainly strongly implied that NDAs don't apply with the phrase "their magically novel, super unique code is just too secret". You used the words "is just too secret" sarcastically, which led me to believe you think in fact the work is not too secret to exfiltrate to the LLMs on the Internet: that is, that NDAs don't apply.
> In reality a programmer could yield immense value from such tools having a) given it zero lines of their code, b) used zero lines of the code it generated.
We simply have different experiences of life! I think with the advent of Claude 3.5 Sonnet the LLMs have just about edged out ahead in terms of time saved vs time wasted, for me, but before Sonnet I'm fairly confident they were moderately net negative for me.
Can you give some concrete examples of where they've helped you this dramatically? With links to chat logs? I still don't understand how people are finding them so useful, and I keep asking people and they keep not providing chat logs so I can see what they're doing.
Novel to the programmer.
Even the best programmers have very narrow skills relative to the whole field.
Yeah, I've used 20+ languages and hundreds of technologies and a variety of different types of product and can pick things up quickly. But it's still a drop in the bucket of technologies you can use and types of problem you can solve.
Programmer skills are deep but narrow. LLM skills are shallow but wide. It's an excellent complement for any programmer working outside their deep+narrow expertise.
Gluing together react components /s
ChatGPT thinks 9.11 > 9.9. I'm in no hurry.
I spend way too much of my working life with package version ranges. It took me a minute to understand why this was wrong.
ChatGPT knows about other domains (e.g. software versions) where that inequality is true. Try telling it you’re doing arithmetic.
> ChatGPT thinks 9.11 > 9.9
I've confirmed this; I asked ChatGPT: "9.11 > 9.9, true or false?"
True because .11 is greater than .9
Even when ChatGPT starts getting these simple gotcha questions right it's often because they applied some brittle heuristic that doesn't generalize. For example you can directly ask it to solve a simple math problem, which nowadays it will usually do correctly by generating and executing a Python script, but then ask it to write a speech announcing the solution to the same problem, to which it will probably still hallucinate a nonsensical solution. I just tried it again and IME this prompt still makes it forget how to do the most basic math:
Write a speech announcing a momentous scientific discovery - the solution to the long standing question of (48294-1444)*0.3258
4o and o1 get this right.
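For reference, the arithmetic in that prompt is trivial to check outside the model:

    # The answer the model should announce, up to float rounding.
    print(round((48294 - 1444) * 0.3258, 2))  # 15263.73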
LLMs should never do math. They shouldn't count letters or sort lists or play chess or checkers. Basically all of the easy gotcha stuff that people use to point out errors are things that they shouldn't do.
And you pointed out something they do now, which is creating and running a Python script. That really is a pretty solid, sustainable heuristic and is actually a pretty great approach. They need to apply that on their backend too so it works across all modes, but the solution was never just an LLM.
Similarly, if you ask an LLM a chess question -- e.g. the best move -- I'd expect it to consult a chess engine like Stockfish.
> LLMs should never do math. They shouldn't count letters or sort lists or play chess or checkers.
But these aren't "gotcha questions", these are just some of the basic interactions that people will want to have with intelligent assistants. Literally just two days ago I was doing some things with the compound interest formula - I asked Claude to solve for a particular variable of the formula, then plug in some numbers to calculate the results (it was able to do it). Could I have used Mathematica or something like that? Yes of course. But supposedly the whole purpose of a general purpose AI is that I can use it to do just about anything that I need to do. Likewise there have been multiple occasions where I've needed ChatGPT or Claude to work with tables or lists of data where I needed the results to be sorted.
They're gotchas in the sense that people are intentionally asking LLMs to do things that LLMs are terrible at doing. LLMs are language models. They aren't math models. Or chess models. Or sorting or counting models. They aren't even logic models.
So early on the value was completely in language. But you're absolutely correct that for these tools to really be useful they need to be better than that, and slowly we're getting there. If you're asking a math question as a component of your question, first delegate that to an appropriate math engine while performing a series of CoT steps. And so forth.
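A minimal sketch of what that delegation looks like, with a hand-rolled calculator standing in for the math engine (illustrative only, not any real agent framework; in practice the model emits a tool call with the expression, and the host executes it and feeds the exact result back into the prompt):

    # Delegate arithmetic to a real evaluator instead of letting the
    # model guess. Supports only +, -, *, / and numeric literals.
    import ast
    import operator

    OPS = {
        ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv,
    }

    def calculator_tool(expr: str) -> float:
        # Walk the parsed expression; reject anything but basic arithmetic.
        def walk(node):
            if isinstance(node, ast.Expression):
                return walk(node.body)
            if isinstance(node, ast.BinOp) and type(node.op) in OPS:
                return OPS[type(node.op)](walk(node.left), walk(node.right))
            if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                return node.value
            raise ValueError("unsupported expression")
        return walk(ast.parse(expr, mode="eval"))

    # The model's only job: decide a computation is needed and emit the
    # expression, e.g. {"tool": "calculator", "arg": "(48294-1444)*0.3258"}.
    print(calculator_tool("(48294 - 1444) * 0.3258"))  # ~15263.73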
If this stuff is getting sold as a revolution in information work, or a watershed moment in technology, or as a cultural step-change, etc, then I think the gotcha is totally fair. There seems to be no limit to the hype or sales pitch. So there need be no bounds for pedantic gotchas either.
I entirely agree with you. Trying to roll out just a raw LLM was always silly, and remains basically a false promise. Simply increasing the number of layers or parameters or transformer complexity will never resolve these core gaps.
But it's rapidly making progress. CoT models coupled with actual domain-specific logic engines (math, chemistry, physics, chess, and so on) will be when the promise is actually met by the reality.
With general mathematical questions, I've often found WolframAlpha surprisingly helpful.
o1 gets this correct.
And here lies the dichotomy of correctness: context.
Indeed, 9.11 is chronologically higher than 9.8, and chronology is an extremely common use case.
However, many will still give that answer a grade of F.
9.11 > 9.9 is true for software version numbers. For floating point numbers that is false.
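A quick check makes both readings concrete (the version comparison assumes the third-party packaging library is installed):

    # Same tokens, two orderings.
    from packaging.version import Version

    print(Version("9.11") > Version("9.9"))  # True: components compare as (9, 11) > (9, 9)
    print(9.11 > 9.9)                        # False: as decimals, 9.11 < 9.90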
ChatGPT 4o gets both of these cases correct for me.
It's weird: with "is the following statement about floating point numbers true: 9.8 > 9.11" it works, but otherwise it has no ability to do it with "decimals".
JavaScript thinks that 11 < 3 but it's still kinda useful anyway from time to time:

    > [11,9,1,3].sort()
    [ 1, 11, 3, 9 ]
If you want to know whether javascript thinks `11 < 3`, then just evaluate it directly. There is lots of dumb stuff in JS IMO, but be honest about it.
You're getting downvoted because your blatant attempt at language wars has a very simple, logical explanation: the default sort coerces elements to strings and compares them lexicographically, which is why 11 sorts before 3. If you wanted to use a 'gotcha', there are far better examples.
? My calculator does too. Unless you mean (9, 11) > (9, 9) which is an entirely different thing.
You should get your calculator checked. 9.11 is definitely less than 9.9
I imagine they are not from an anglophone country and see 9.11 as 9*11
> And FWIW, I cannot imagine a programmer, in 2024, that remains dismissive of LLMs or the related tooling.
Hi.
> While they are grossly oversold by the LinkedIn crowd,
That is true.
> and are still a far ways from so-called "prompt engineers" replacing devs,
Also true.
> they're a massive accelerator and "second set of eyes", especially if you're doing varied, novel work.
That is not even remotely my experience.
Like I can envision a programmer who would get benefits from it, but bluntly put, the code I work on day to day is far, far too interesting to be handled by Copilot, simply because there aren't nearly enough Stack Overflow pages about it to be scraped. Honestly, if you found yourself able to automate most of your job with Copilot, if anything, you have my sincerest condolences. I can't imagine how utterly bored you are in your day-to-day.
If Copilot could get to a place where it could understand, comprehend, and weigh in on code, that would be incredibly useful. But that's not what Copilot is, because that's not what transformers are. They are fancy word-probability calculators. And don't get me wrong, that has uses, but it is nothing I'd be comfortable calling a second set of eyes for anything, save maybe for writing.
It's interesting that there is this division between programmers who claim LLMs are super helpful, and those saying they are useless.
While it's certainly possible that this divide is based on how 'hard' the problems people are using them on, my current theory is that some people use them like the proverbial rubber duck - in other words, a way to explore the code, and generate some stuff to work on, while thinking through the problem.
Personally, I have not yet tried it, so I'm curious which side of the discussion I'll fall on ...
I think young programmers who are less heavily invested in their skills and who haven't built a life that's highly dependent on using them are generally more interested in figuring out what programming with LLMs means.
But so are much older programmers who have seen it all, including the obsolescence of many of their skills, and who are not so dependent on continuing to use them as they could retire anyway.
It's more the middle (programmer) age senior programmers who are less likely to see any use.
I've seen the same pattern with artists' interest in generative AI.
But it's complicated because it IS also dependent on what you're doing. So it's hard to know if something is being dismissed correctly due to domain/expertise, or prematurely due to not putting the work in and figuring out what these tools mean.
This really touches on it. I'm a big advocate of these tools, but they author approximately zero lines of my code, yet I still find them invaluable and a wonderful tool to leverage, and do so constantly. Particularly in challenging projects and needs.
I suspect many who find them useless and decry them were sold an exaggerated utility and then were disappointed when they tried to generate libraries or even functions, then feeling deceived when there are errors or flaws, etc.
> I suspect many who find them useless and decry them were sold an exaggerated utility and then were disappointed when they tried to generate libraries or even functions, then feeling deceived when there are errors or flaws, etc.
No, I suspect the large majority (and this has been backed by surveys) of people that are dismissive of them are more senior and have been working in highly specific problem domains for a long time where this is rarely/never a good "general" answer for a problem, and have spent an inordinate amount of time debugging LLM-generated or LLM-Advised code by their peers that contains nefarious and subtle errors that look correct at a glance. I personally can tell you that for what I work on, in my domain, these tools have been a net time suck and not a gain, and I pretty much only use them to ask questions about documentation, which it often gets incorrect anyway (again, in subtle ways that are probably hard for someone who isn't very senior to detect).
Hope that helps.
Yes, absolutely, they’re an ideal rubber duck and I’ve come to really value them for my work. Checking your sanity, pondering how certain operations might be implemented or how they could be optimized, finding where a logic bug might be in a snippet of code…
> It's interesting that there is this division between programmers who claim LLMs are super helpful, and those saying they are useless.
My take is: if the project is doing something that has been asked a thousand times on stackoverflow and has hundreds of pages in the tutorial content mills, the LLM will tell you something reasonably meaningful about it.
I'd hazard a guess that most people overenthusiastic about those tools are gluing together javascript libs.
This is not necessarily a bad thing, we even asked a LLM today at work to generate some code for a library that we didn't know how to use but seems fairly popular, and the output looked like it would make sense. (Can't tell you how it ended up because I wasn't the one implementing the thing.)
However, we also spent 2 hours in a group debugging session because we're working on a completely custom codebase that isn't documented anywhere on geeksforgeeks, stackoverflow or anywhere else public. I highly doubt that even a local LLM would be able to help, and no way this code is leaving the premises.
>if the project is doing something that has been asked a thousand times
There are many billions of lines of high-quality, commented code online, covering just about everything. Millions of projects. All of Linux. All of Android. All of PGSQL and SQLite and MySQL and Apache and Git and OpenSSL and countless encryption libraries and countless data tools, video and audio manipulation, and...
Every single project is absolutely dominated by things that have been done many, many thousands of times. The vast bulk of your projects have zero novelty. They're mixing the same ingredients in different ways. I would think any experienced developer would realize this.
>I'd hazard a guess that most people overenthusiastic about those tools are gluing together javascript libs.
At this point it's comedy how often this "oh I understand that the noobs get value from this, but not us Advanced Programmers". It's absurdist and honestly at this point I just shake my head. My day is filled with C++, Python, Rust, Go, the absolute cutting edge of AI research, and I find these tools absolutely invaluable now. They are a massive accelerator. Zero JavaScript libs or "LOL WEB DEV" programming in my life.
Yes, you mentioned those things that are documented everywhere. I do use LLMs to give me skeleton code for those parts I'm not familiar with.
How about a full equivalent of Qt that is proprietary and has absolutely nothing public in it? How is a LLM going to help with that? There is no public info anywhere.
> the absolute cutting edge of AI research
No offense but there are billions of public pages about "AI" research since it's the new gold rush. Of course LLMs have material about all your libs.
>but there are billions of public pages about "AI" research
Billions? For many of the things I am working on there are zero public pages outside of research papers. I said nothing about working with libs. Again, I'm not asking an AI "here's my project now finish it", I'm working with AIs for the countless little programming challenges and needs. Things that mirror things done in many, many other projects, most having nothing to do with my domain.
As an aside, starting that with "no offense" as an attempt to make it insulting is...weird.
I feel like this discussion is taking place ten years ago. The weird reference to StackOverflow is particularly funny.
Some people seem to be taking me saying "you must work on boring code" as a judgement against them as developers, and it isn't. I'm speaking directly from my experience: If I asked it beginner-tier questions about how to do X in Y language, it would get those right quite often. I could see Copilot being very useful if you're breaking into a new language, or just knocking the rust off the gears of one in your head.
And like, even for those who write a lot of boring code, like... cool man. I don't judge people for that. We need all code written and all code is not exciting, novel, or interesting and there's nothing wrong with doing it. Someone's gotta.
I'm just saying that the further up the proverbial complexity chain I went, the less able Copilot was. And once I was quite in the weeds, it seemed utterly perplexed and frankly, not worth the time in asking.
>as a judgement against them as developers
No one takes it as a judgment, and no one is offended. It's just a truth that when people make such claims, they're often exaggerating the uniqueness or novelty of what they're doing.
You described your work in another comment, and what you described is the most bog standard programming in the field. It's always the case.
Yeah, at least 90% of any job is just making license plates. I have worked on both very complex and challenging code and also very simple and easy code in my career, even within one job.
There is a lot to unpack here, but is your code so interesting that it defies understanding by an AST? Code models are trained to semantically represent code, so unless you use semantics that exist outside of software engineering the claim that your code is too unique for llm is false.
Maybe you are imagining a case where the entire codebase is generated by a single prompt?
An abstract syntax tree is not semantics. And language models don’t do this kind of explicit syntax parsing and representation at any rate.
I'll admit I haven't seen the training data but some basic googling shows a subset of the labeling is syntax annotations. I am not claiming LLMs parse code in the way you are suggesting, but they certainly have token level awareness of syntax and probable relations which are the roots of any programming language.
The shorter and also "not breaching confidentiality" answer I can give is we're dealing with setting up custom sockets over incredibly long range wireless connections that require clear and verified transmission of packets of data, rolling our own messaging protocol and security features as we go.
When last I tried anyway, Copilot was, frankly, useless.
FWIW, Copilot is not a particularly powerful LLM. It's at most a glorified, smarter autocomplete. I personally use LLMs for coding a lot, but Copilot is not really what I'd have in mind saying that.
Rather, I'd be using something like the Zed editor with its AI Assistant integration and Claude Sonnet 3.5 as the model, where I first provide it context in the chat window (relevant files, pages, database schema, documents it should reference and know) and possibly discuss the problem with it briefly, and only then (with all of that as context in the prompt) do I ask it to author/edit a piece of code (via the inline assist feature, which "sees" the current chat).
But it generally is the most useful for "I know exactly what I want to write or change, but it'll take me 30 minutes to do so, while with the LLM I can do the same in 5 minutes". They're also quite good at "tell me edge-cases I might have not considered in this code" - even if 80% of the suggestions it'll list are likely irrelevant, it'll often come up with something you might've not thought about.
There's definitely problems they're worse than useless at, though.
Where more complex reasoning is warranted, the OpenAI o1 series of models can be quite decent, but it's hit or miss, and with the above prompt sizes you're looking at $1-2 per query.
>Hi.
I was being dismissively rhetorical. I can actually imagine them because we see them on HN constantly, with a Luddism that somehow actually becomes a rather hilarious attempt at superiority. A "well if you actually get use out of them, I guess you're just not at my superior level of Unique Projects and Unique Code". It's honestly just embarrassing at this point.
>but bluntly put, the code I work on day to day is far, far too interesting to be handled by copilot
Hilarious. It's a cliche at this point.
Let me take it further and turn this on its head: The people who usually think LLMs aren't valuable to coders generally work on the most boring, copy/paste prattle. They stick to the same tiny niche all day every day. They're basically implementing the same thing again and again. It's so rote, and they're so profoundly unchallenged, that tooling has no value.
>with a Luddism that somehow actually becomes a rather hilarious attempt at superiority
>The people who usually think LLMs aren't valuable to coders generally work on the most boring, copy/paste prattle.
Here you are doing the same thing, aren't you?
Instead of calling people names, the biggest tell of a weak argument, why don't you explain the type of work you do and how using an LLM is faster than if you coded it yourself, and/or faster than any current way of doing the same thing.
I'm assuming you are a senior+ level coder.
But...I'm not doing the same thing. In actuality I'm saying I'm a fairly typical programmer in a common situation: I work across a variety of languages and platforms and toolings and projects, building solutions for problems. The truth is that extraordinarily few programmers are working on anything truly novel. Zero of the readers of this comment are, in all likelihood. The ridiculous notion that someone has so unique of a need that it hasn't been seen is kind of hilarious nonsense. It's the "I'm so random! Other girls aren't like me" bit.
>Instead of calling people names
Who called anyone a name? Luddism? Yes, many HN participants are reacting to AI in a completely common rejection of change / challenge, and it recurs constantly.
>how using an LLM is faster than if you coded it yourself
I am coding it myself. Similar to the other guy who talks about putting an LLM in "charge" of his precious, super-novel code, you're setting up a strawman where using an LLM implies some particular scenario that you envision. In reality I spend my day asking questions, getting broad strokes, getting code commented, asking for APIs or resources, etc.
I suspect you're overstating the degree to which an LLM might be unsuitable for some types of work. For example, I'm a data scientist who works primarily in the field of sales forecasting. I've found that LLMs are quite poor at this task, frequently providing answers that are inappropriate, misleading, or simply not a good fit for the data we're working with. In general I've found very limited use in engaging LLMs in discussion about my work.
I don't think I'm calling myself a super special snowflake here. These models are just ... bad at sales forecasting.
LLMs aren't entirely useless for me. I'll use ChatGPT to generate code to make plots. That's helpful.
I would never recommend an LLM for sales forecasting. It's just the wrong tool for that job.
>Who called anyone a name? Luddism?
Sorry, perhaps I misinterpreted it.
>In reality I spend my day asking questions, getting broad strokes, getting code commented, asking for API or resources, etc.
Can you give me some concrete examples? I'd like to use it, but I'm currently of the mind:
1. If it's boring code, I can write it faster than asking an LLM to do it and fixing its issues.
2. If it's not boring code, like say a rules engine or something, I'm not sure the LLM will give me a good result based on the domain.

I mainly stick to back-end work, automation, and building WebAPIs and DSS engines for the medical field.
Maybe I'm under and over thinking it at the same time. FWIW, I typically stick to a single main language, but where I usually work, the companies dictate a GP language for all our stuff: C# in my example. I do a small amount of Python for LLM training, but I'm just starting out with Python. I can see it being useful saying, "convert this C# to Python," but honestly, I'd rather just learn the Python.
> Who called anyone a name? Luddism? Yes, many HN participants are reacting to AI in a completely common rejection of change / challenge, and it recurs constantly.
You should read up on what Luddism and Luddists were actually about. They didn't think the machines were evil or satanic, which is the common cultural read. They assumed (correctly) that the managerial class would take full advantage of increased productivity of lower-quality goods to flood the market with cheap shit that would put competitors out of business, and let them fire 4/5 of their workforces while doing so. And considering the state of the textile industry today, I think that was a pretty solid set of projections.
Luddites didn't oppose automation on the basis that machines are scary. They were the people who worked the machines that already existed at the time, after all. They opposed them on the basis that the greedy bastards who owned everything would be the only ones actually benefiting from automation, everyone else would get one kind of shaft or another, which again: is exactly what happened.
This, actually is incredibly analogous to my opinions about LLM. It's an interesting tech that has applications but is already being situated to be the sole domain of massive hyperscalers and subject to every ounce of enshittification that follows every tech that goes that way, while putting creatives, and yes some coders, out of a job.
So yes, it was name calling, but also I don't object to the association. In this case, I'm a Luddite. I am suspicious of the motivations and the beneficiaries of automation being forced into my industry and I'm not going to be quiet about it.
> the greedy bastards who owned everything would be the only ones actually benefiting from automation, […] which again: is exactly what happened.
What also happened is that everyone can buy clothes incredibly cheaply. Which seems like a widespread benefit.
>You should read up on what Luddism and Luddists were actually about.
They were primarily opposed to automation because it devalued the work they did and the skills they held. That is the core essence of Luddism. They thought if they destroyed the machines, automation could be stopped. There were some post-facto justifications like product quality, but if that was true they'd have no problem out-competing the machines.
Yes, it is Luddism that drives a lot of the AI sentiment seen on HN, and it is utterly futile: basically people convincing themselves and each other while the world moves on. There is no "name calling", and that particular blend of pearl-clutching is absurd.
IMHO a lot of "Luddism", as labeled by pro-AI bros, is just people furious about the shoddy artefacts that degenAI produces. It does compare to original Luddism, the difference being that the 19th-century opposition to the industrial revolution was eventually proven wrong by improved quality, whereas genAI hasn't been.
>And considering the state of the textile industry today, I think that was a pretty solid set of projections.
I think it's just about all industries these days.
Yes, so many quote meanings have been malformed over the years, such as "a rolling stone gathers no moss," is considered good, while originally bad. "Blood is thicker than water," "Money is the root of all evil," etc.
The Luddites were right.
That was a whole lotta words to say "Nuh uh." and as such I don't really have a response.
Imagine it. I’m dismissive. The current crop are mediocre interns.
I bought MS Copilot licenses for my company around February 2023. They were sold as a yearly commitment and offered no trials. A bad deal, but I was afraid of missing out on AI productivity gains.
I'm definitely not renewing those. Times are hard and the value provided does not justify the cost.
I wonder how many companies will do the same.
In some cases it is useful, like in Excel where it can generate formulas or describe an approach to a problem. Not different from GH Copilot. The same goes for the MS Power Automate editor's AI assistant.
But in other cases, like Word or Outlook, it is just a lousy summarizer and does not have much added value.
That's surprising. I haven't met a single user who found Copilot useful in Excel.
I've found some that enjoyed using it as a "smart autocomplete", both in Word and Outlook, however.
Most of my users are financial analysts.
I've found Copilot to be really good at generating funny memes and haikus for specific issues/tasks where I work. The productivity gains come from me not having to use Photoshop and having more time to browse websites like this one.
I recently personally resubscribed to copilot after cancelling my subscription a couple months ago since it was not providing value. But now with the new/beta “Edit” mode and being able to specify to use o1, o1 mini, and sonnet 3.5, the $10 a month feels a lot more worth it. The edit mode has outperformed aider for me.
Sorry, I think we are talking about different stuff.
I was referring to Copilot for M365, which costs 30 USD / month and targets biz users.
Github Copilot is definitely worth it.
Oh wow, I definitely misunderstood, all the “Copilot” usage by MS/Github got me mixed up
Interesting because I pay for Copilot and Supermaven, although most of my coworkers don't. To be fair, it was twisting their arms to get them to use linters, formatters, and other tools so I think asking them to use AI auto-complete is a bit much right now.
> any sort of LLM writing is a tacit acknowledgement that people don’t really care what they are writing
Couldn’t be further from my experience. I work with a distributed team that has four different native languages, with substantial engineering staff for each.
LLMs have made our cross-geo communications so much better. I can get a summary of a standup that happened overnight (to me) and in a foreign language (to me) when I get up in the morning. I can write idiomatically and naturally without worrying it will be confusing or misinterpreted.
Plenty of people are not naturally talented writers, especially in your language of choice. LLM writing doesn’t mean they don’t care, it means they DO care about communicating clearly.
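The mechanics of this are a thin wrapper around one prompt. A rough sketch, assuming the OpenAI Python client; the model and prompt wording here are illustrative, not what our team necessarily runs:

    # Summarize an overnight standup transcript into the reader's language.
    from openai import OpenAI

    client = OpenAI()

    def summarize_standup(transcript: str, language: str = "English") -> str:
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system",
                 "content": f"Summarize this standup transcript in {language} "
                            "as 3-5 bullets: decisions, blockers, action items."},
                {"role": "user", "content": transcript},
            ],
        )
        return response.choices[0].message.content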
> I can write idiomatically and naturally without worrying it will be confusing or misinterpreted.
Note that this depends on the language. LLM translations aren't that much more accurate than their predecessors, other than being more strongly constrained to grammatical English, and there's a limit to how natural one can be through translation, depending on how much cultural co-development your own background had with the target environment: your framing and storytelling will sound off even through human translators if the cultures are too far apart.
LLMs tend to be a bit more robust than previous MTs, that's for sure.
I like that--generally I agree that LLMs inflating text is not adding value, but it is a great point that LLMs can help bridge language barriers.
How accurate are the summaries?
This is what worries me. I'd say that >50% of the time, Copilot-generated summaries of meetings in which I was presenting misinterpret what I said in some important way.
I'd much rather (and do) take a couple minutes to write my own take-aways. It's the golden rule in a business context: I'd prefer to receive three one-sentence bullet points that are actually accurate over a couple pages of AI slop, so that's what I give to my colleagues.
The cross-language factor is an interesting angle I haven't had to contend with, though.
I need to do that kind of thing all the time, and it annoys me to no end when people post the summaries in chats to "catch up" on a meeting, because I know they're wrong. As a European who understands five languages well and can take some solid hints in many others, nothing beats actually listening or scanning a transcript.
I turned on Apple Intelligence to summarize notifications from Outlook, Teams, etc. I haven't found it to be very accurate yet, especially for MS Teams.
Not OP, but I'll answer from my experience trying several different tools for this: the good ones are roughly as accurate as having a human note taker, familiar with the domain terminology, would be on average.
They are quite accurate, but the problem is that they sometimes leave out a critical piece of information said in the meeting, one without a lot of repetition that turns out to be a crucial factor, and the LLMs completely miss that.
Do you use any specific tools to streamline the translation or format of your communication? I'm facing the same problem, but wonder how well an LLM would do at translating / understanding the domain-specific terminology that we use.
> The summaries are just not very good, and any sort of LLM writing is a tacit acknowledgement that people don’t really care what they are writing, and that no one is really reading that writing very carefully.
I tend to write emails aggressively. The friendly option is helpful to tone down the rhetoric.
I do the same! Lately when something really infuriates me, I tend to lean into it, write the email with insults and all, and let chatgpt make it into something that won't get me fired.
> We’re actively pressured to use co-pilot at work
Unless you work at Microsoft, that's really interesting to me. Can't say I've ever experienced a boss pushing for me to use a certain developer tool.
Makes complete sense - some higher-up is betting on LLMs to make a big change to the org. If they're right, then leadership looks amazing (we got in early on a transformative new technology/tool that has resulted in major productivity wins!)
If they're wrong - they also look good (we tried the latest fad, pushed everyone to really give it a go, and decided it wasn't worth it).
No trial period for CoPilot means everyone wants to get their team to jump in and make the most of it quickly so they can evaluate whether to renew it next year.
I've had a little bit of that in a previous startup. It's easy to explain: there's so much content around "AI is going to increase productivity", and "if your company doesn't use AI it won't be competitive". AI evangelists are everywhere, from the internet to their direct network, reinforcing the idea. Investors are also influenced the same way, and put direct pressure on the leaders.
So it results in the CEO telling everyone that they should be using copilot (amongst other tools) because it'll improve their productivity, without knowing for a fact whether that's true or not.
Execs are simply following the manufactured consensus about AI's impact on productivity and innovation. Go with a name brand like Microsoft and get that promotion for digital transformation or whatever. Nobody ever got fired for buying IBM.
I have many friends telling me their job is actively pushing for them to use AI, asking them to be creative about the use of AI, and in some cases pivoting the business to AI products which have no profit margin for the business.
I have worked at many places, and often as soon as there's enough profit and momentum based on the back of some core product the fad chasing begins in earnest.
Suits then get to do talks about how awesome they are for their AI transition and the synergies it brought to their OKRs.
Or some bs like that.
Work at one or two shops where an expensive product makes the org's life a living hell but two directors or similar get to do a talk about how it has 'transformed' IT. It will all make sense then.
I've experienced this, it mostly comes with eye-wateringly expensive yearly contracts the company has signed where whoever made the decision needs to show that even low single digit percentages of company staff are using the thing they received a license for.
I assume in some (poorly-run) companies, the person whose Big Idea (TM) it was to introduce this stuff may push people to use it, so that it seems like their thing was successful.
Not OP or at Microsoft, but also get the same pressure. Our business is financially invested in the hype, so it makes sense.
Less about being able to truly say a lot of people use the thing/find benefit. The first part is fine
It's pretty common in large orgs, I think. Someone gets sold on or just excited about the features claimed by a particular product and becomes the czar for that thing. They want to make the transition to that product successful for their own career reasons. If people aren't using it, then the purchase was a bad move, so people start getting pressured to use the product regardless of how useful it is.
Years ago, my company switched from Skype for Business to MS Teams, and we just used Teams the same way we did Skype for Business (PMing each other directly), and the guy who managed the transition was always lecturing us about having conversations in public threads instead.
I'm in a large org where Copilot is pushed. What I've heard from upper management is that even a very small average productivity gain, multiplied by the number of developers we have, means that we have "saved" x number of man-hours per year. Man-hours can be translated into "number of employees", so if the Copilot license comes in under that employee salary total, it is a win.
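The back-of-the-envelope version of that argument is easy to reproduce; every number below is made up for illustration, not anyone's actual figures:

    # Break-even sketch: tiny per-developer gain vs. license cost.
    developers = 1000
    hours_per_year = 2000
    productivity_gain = 0.01          # a "very small" 1% average gain
    loaded_cost_per_hour = 75         # USD, assumed

    hours_saved = developers * hours_per_year * productivity_gain  # 20,000 h
    value_saved = hours_saved * loaded_cost_per_hour               # $1,500,000
    license_cost = developers * 19 * 12                            # $228,000 at e.g. $19/user/month

    print(value_saved > license_cost)  # True under these assumptions

Of course the whole thing hinges on the average gain actually being positive, which is exactly what's in dispute.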
Maybe because nobody really asked for these things, Genmoji and "rewrite my email".
Neural nets are a foundational technology; there should be an ecosystem of apps based on them. People pay $1,500 for an iPhone because it has apps, not to call and text people. I think it's coming, and Apple can't just NOT do AI, so I get it, but it will take a little time.
The Zoom AI summary seems to work pretty well for me.
Personally I've found Copilot's meeting summaries to be really good. They're definitely a lot better than most minutes I've ever seen taken contemporaneously.
The crazy thing to me is that people are okay with recording 100% of every meeting they ever have. I've been in plenty of meetings where you just don't want things recorded, and no, that does not mean the meeting was shady.
Other than legally sensitive topics, what meetings would you not want recorded? Do people take notes or are they just totally ephemeral?
Sometimes people want to express things (frustration at deadlines or upper management or skepticism of projects or announcements for example) and not have it carved in stone. I purposefully allow some time at the end of presentations or large team meetings for a more informal non-recorded Q&A for that reason.
The number of meetings I attend where no notes are taken dwarfs the number with notes taken. Clearly, I must live in a different universe from the rest of everyone here where notes are taken at every meeting
IMO there is no advantage to having "community" notes taken of a meeting: I take my own notes about decisions made in the meeting that I care about, and items I need to take care of. Therefore, every meeting I'm in has notes taken, though no one else is even aware (unless they are taking their own, which I assume many are). I think this is relatively common (since I can't see how one can function in most knowledge work without some version of this) so I would assume that you probably are in meetings where notes are taken, even if you never see them.
This is moving the goal posts, though, and you know it.
In a meeting with "notes taken", there is someone tasked with doing so in an official manner usually decided by the person coordinating the meeting. It is also typically known by the meeting attendees that this is the case. Those notes are usually made available after the fact in whatever form is normal for that company. In Zoom type meetings, we all get the "recording" notice during the meeting.
What individuals do is not the point of this thread. However, we've seen plenty of examples where individuals recorded, on their own devices, meetings which were meant to be private, and those recordings leaked later. Again, that is out of scope for the general purpose of this. Again, you know this. Or at least, you should.
Sorry, wasn't trying to move any goalposts: you're replying to my first comment in the thread. I've never experienced the situation you describe, and literally didn't know that this was a practice. I've seen formal meetings run by Robert's Rules, which have a secretary keeping "minutes", which are a subtly different thing and are generally not useful for anything other than official purposes.
I've also seen meetings where some enterprising project manager says they will "make notes available". Sometimes they even do this, but even when they do, the notes are useless for everyone who isn't them -- they've just distributed their personal notes in the hopes that somehow they would be useful to others in the meeting, which of course they aren't.
Have a nice day.
We're not allowed to record meetings at work, with some exceptions such as trainings. I'm not at all sure why, but I believe it has to do with merger plans. We're not working on anything particularly shady, to my knowledge.
Notes are often taken but not always. Depends on the people in the meeting.
I think LLMs work best when you go to them and know what you’re getting and how hard you’re going to work to get something good.
When it’s just there… you expect something better than that and you can’t quite “work it” the same as when you’re iterating through the native interface, or API.
I use LLMs a lot, all day long, but I’ve disabled Apple Intelligence.
[I’m also salty how bad typing still is on an iPhone and they’re messing with this stuff.]
The fact that the GPT-style models are literally autocomplete engines, while text autocompletion on phone keyboards remains terrible, really maddens me. Is it that they simply aren't using that style of model for autocompletion, or that the GPTs aren't great at guessing me specifically even though they're superhuman at autocompleting the Internet, or that it would be too slow to run a sufficiently good GPT for this on-device, or what?
I fear this may be the best auto correct will ever be.
> Companies are afraid of being left behind in the new arms race...
Sometimes it takes all the running you can do, to keep in the same place.
I'm desperately waiting for co-pilot to be implemented at my company for then productivity gains. For example, I assume I'll be able to do the following:
- Select text in any report I create and say "rewrite this in a professional tone"
- I write a sloppy email with a few bullet points of what I'm trying to get to and have it rewrite it in my tone
- Immediately after a call about a new project, present notes of the call with next steps, key agreements made, etc.
- Search Outlook for specific things like "Kevin from PE Firm LLC said he wanted this report by which date, can you tell me?"
The paper trail only has value in dire situations; the only ones reading it, editing it, and tracking it are people complicit, explicitly or implicitly, in corporate crimes.
> Worse are the meeting summaries
These can be quite _funny_, sometimes, but they're certainly not useful, yeah.
Do you use a search engine? Do you think search engine results pages are more efficient than getting an answer from an LLM?
Apple’s LLM is uniquely useless, because it doesn’t do anything LLM-like and continually returns “here’s what I found on the web”. Actual consumer LLM apps like Microsoft Copilot are leagues ahead.
I disagree about the summaries. I find them to be very good and helpful.
I teach a high-level course, and each week I have the transcript summarized for students; I also get feedback on my lecture and help creating the lesson plan for the following week.
My father was Dean of one of the top teachers colleges in the world and has been blown away by how useful LLMs can be for education.
I hope you check them. AI meeting summary things are... well, not averse to making shit up, or getting things exactly wrong.
LLMs could and should revolutionize education, with proper human supervision and fact checking since they regularly make shit up too
At companies where the average worker is less intelligent than the mean author of the mean piece of online content used in an LLM's training set, the output from the LLM might be more clever or more well-written than what the average worker at that org would generate themself.
At companies where the opposite is true, every LLM output feels like a shittier version of what an employee could have written, like the average Redditor's comment on any given situation.
> I’m thankful that my iPhone is old enough not to use LLMs.
Well, too bad for you, as a 3rd party we had no reason to artificially restrict our handsfree AI voice companion to new iPhone models: https://itunes.apple.com/app/id6737482921?mt=8
(shameless self-promotion)
3 years ago, I and many people were using GPT-3 on our iPhones and Apple Watches (see for example https://www.youtube.com/watch?v=tDB3uIEgU0E&t=36s). For 3 years I wondered why Apple wouldn't make it a native feature.
- Was it because they didn't want to use a third-party AI system? No, because Apple Intelligence uses ChatGPT anyway.
- Was it because Apple didn't know how to add AI features to iOS? No, if anything they have access to way more APIs than I had when I built the iOS Shortcut in the video.
- Was it because Apple didn't take AI seriously? This to me sounds silly but in retrospect, I'm starting to believe this was exactly the case. Apple has been on a wrong trajectory for the past few years:
- They were busy building Apple Car. Then when the AI pressure finally kicked in, they stopped the car project. Why were they in the car business in the first place?! And why did they simply tell most Apple Car employees to work on AI now? Cars also have AI but it's not the same thing.
- They built Apple Vision Pro but have failed to showcase any real-world use case for it yet. Apple Intelligence isn't even supported on the AVP. I don't want to call it a failed project but by many measures it is.
- Apple, and specifically Tim Cook, are great at optimization: from the supply chain to setting optimal prices that extract the most value from customers. But they forgot what allowed them to do this in the first place: a decade-long vision set by Steve Jobs.
Current-day Apple isn't going to disappear, but it's becoming less and less relevant in the broader tech landscape. In a sense, Apple has become the McDonald's of handheld gadgets. But if I want the latest tech, the most interesting use-cases of AI, or simply ground-breaking devices no one has heard of, I'd go to a French/Persian/etc. restaurant instead of Apple's McDonald's.
Thanks for the obligatory “Apple is doomed” comment.
I get it, and odds are one day it will be true so no harm trotting it out decade after decade, but I don’t think that time is now.
- Watch and AirPods have both been incredibly successful and were pure Cook-era Apple. Both were first envisioned after Jobs died.
- Car was a failure, but it seems odd to criticize Cook for being a mere optimizer when he enthusiastically spent billions on the project. Which is it? No vision, or too-ambitious vision?
- Agree that AVP has been disappointing. But again, so what? It’s similar to the Newton. Not every product will succeed. And again it’s hard to see AVP as a “McDonalds” effort.
I don’t know what it is about Apple that attracts these “they are doomed if every product isn’t as revolutionary as the iPhone” takes. Who else is having world-changing home runs with every product?
Totally agree with you, and you didn't even count their insanely great M chips, which have even drawn the interest of people running Linux and servers.
Those chips are leading the industry in performance per watt and other metrics, and are fascinating as chips, not just Apple or ARM chips. Their laptops were “fixed” and many, including myself think they’re the best laptops we’ve ever owned.
I feel the butterfly keyboard laptop era was when they were lost in the woods. But now they're killing it. Even Apple Maps is great.
Thanks for misinterpreting my comment. /s
I said Apple is not gonna disappear (i.e., it's not doomed), but it's less and less relevant in the tech space. But you do you.
"We’re actively pressured to use co-pilot at work"
Holy shit, really? That sounds horrific.
Not that different from any other buzzword hype bubble technology that clueless easily-manipulated management got sold on and imposed on everybody.
> The summaries are just not very good
That's the fault of the original author, who should have considered that their work may be summarized by AI, and should have written to that audience in the first place. /s
So, surveys.
"Smartphone users in general are unsatisfied with the existing AI features as the survey recorded 73% of Apple Intelligence users and 87% of Galaxy AI users stating the new features to be either ‘not very valuable’ or they ‘add little to no value’ to their smartphone experience."
This could be read as: 27% of users who have tried a brand new way of interacting with their phone found it useful to some degree right away. Imagine: you introduce a brand new, well, anything into people's lives at Apple scale, and 27% of them find it useful, right away.
I don't know what number people expect, but I think you could consider this an outrageous success. I am not saying it should be read that way. But then again, I am not sure why it should be read any other way.
Polls are weird like that, especially for things that have strong brand affinity, like with soda or politics. If Apple falsely announced that they had a new 4D holographic battery and charged an extra $100 for it in new phones, I bet about 25% of users surveyed would report being satisfied with it.
> This could be read as 27% of users
Not necessarily. Poll results are not %Pro + %Con = 100%. It could equally be that 73% find no use and 27% have not tried it, leaving 0% liking it.
The data is in the linked survey: https://www.sellcell.com/blog/iphone-vs-samsung-ai-survey/
- 11.1% of iPhone's users felt the AI features were "very valuable"
- 5.9% of Samsung Galaxy's users felt the AI features were "very valuable"
So what's the 3rd category since the numbers do not add to 100?
- 11.1%: Yes, they're very valuable
- 15.9%: Somewhat valuable but not significantly valuable
- 64.7%: Not very valuable, other features are more important
- 8.3%: No, they add little to no value
It does say 73% of "Apple Intelligence users". So I would assume that 100% of them are not just iPhone owners but people who have actually used Apple Intelligence, to some degree.
I agree. My iPhone usage is mainly to check email/whatsapp/calendar and my banking apps. Rarely take pictures or record videos.
Since my company uses exchange, email and calendar cannot be accessed outside the outlook app, not even by the iOS widgets which renders them next to useless for me.
I personally know people that would claim a phoneless phone is useful, right away, without having used the device once, as long as it was released by Apple.
To be clear, I think a "phone", as in a smartphone that can't make calls, would be mostly as useful for me, personally.
But you might be claiming "people would buy an empty box, if Apple put its logo on the box".
Like an ipod touch!
We have a family friend who is blind and a programmer. It's interesting to hear his perspective. His hope and expectation are that it will greatly increase usability.
I've been thrown into the usability deep end due to my wife also losing her sight to an autoimmune disorder, and my dad losing his to Macular Degeneration. Honestly, it sucks, and I mean like rage-quitting, phone-throwing sucks. (Try it. Turn on voice assist and close your eyes.) If Apple can improve it through AI, so that someone can just talk to the phone to do a series of tasks, it will honestly change everything. The number of aging people who are going to lose their vision in the U.S. is set to go up exponentially in the coming years. This could be an unprecedented win for them, if Apple solves this issue with AI.
I don't have experience with this kind of problem. But I don't think GenAI is the best tool for this, at least not until it's so rock-solid trustworthy that everyone uses such an interface. Even leaving aside AI questions, if I'm looking for a human personal assistant for someone who's blind, and that person will have unlimited access to their electronic life, I'm going to vet that person very, very carefully.
I don't understand the point.
Apple users already let apple (or at least their device) know everything about them.
If a person is blind and can't read or type onto their phone, a tool that can reliably pull up messages app and send Dad a letter is a godsend.
My point is that the user is adding another layer of abstraction, and that layer of abstraction needs itself to be trusted. When UI elements are really concrete and you can clearly see that you pressed a particular button and the thing you wanted happened, then the UI layer, at least, is a nonissue.
But in retrospect I don't know if my point was that good. The UI problem hasn't actually been solved, and an LLM-based chatbot may actually be more reliable for non-tech users since the user has to do less translation.
A few days ago, OpenAI released live video integration with Advanced Voice mode for ChatGPT—point your phone at something and ask what it sees, and it will tell you pretty accurately. I thought it was just a cool trick until I read the top comment on their YouTube video announcement: “I'm screaming. As a visually impaired person, this is what I was eagerly waiting for. Still screaming! Thank you, Sam, Kev and the entire team over at OpenAI.”
https://www.youtube.com/live/NIQDnWlwYyQ
Google released a similar feature with Gemini 2.0 last week. While it doesn’t seem to be integrated with a smartphone app yet (at least on iOS), it can be used through the AI Studio browser interface.
Is this feature somehow different than what Google has had with lens and what Apple has had with the info button in regular photos for a while now?
It uses the live video feed, and you can talk with the LLM.
Sorry to hear about what is happening in your family.
I think your perspective is spot on. VUI (voice user interfaces) will absolutely change the way we interact with computers. After all, talking comes naturally to humans.
The digit divide (old people, very young people, illiterate) still exists. And will likely get bigger if VUIs don't get wide spread adoption.
> Sorry to hear about what is happening in your family.
Non-sequitur, but I cannot be the only person who finds this sort of performative empathy odd and out of place in the context of an HID accessibility discussion.
For some reason I spent a few minutes trying to understand the digit divide before realizing it was a typo (digit divide = digital divide).
I do think VUI as a concept is in its infancy and will (like it or not) both hasten and address the decline of written communication.
While I use LLMs I also consider myself an LLM skeptic in terms of its role in upending the world and delivering the value promised by the folks hyping it up most aggressively.
However, using ChatGPT voice mode and considering the impacts on accessibility, especially if that quality of interactive voice functionality is able to be integrated well into the operating systems of devices we use every day, is very exciting.
LLM-based AI is not needed, or even useful. We know how to make voice interfaces that work, and work well: have done since the 80s. It's just expensive; and it's an expense that nobody in the industry is willing to pay, therefore nobody needs to do it in order to differentiate their product.
What you're missing is that AI solves the expense problem. As the OS vendor you already have an overview and easy access to all interfaces that you expose and it's straightforward to feed that into an integrated AI agent. Add a bit of glue code here and there and a simple implementation is nearly free. Of course, the real value lies in ironing out all the edge cases, but compared to doing all of that manually, it should still be orders of magnitude cheaper.
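To make the "glue code" concrete, here's the kind of plumbing I mean. A minimal Swift sketch, with every name invented for illustration (this is not a real Apple API):

    // Hypothetical: the OS aggregates app-exposed actions into a registry
    // that an integrated AI agent can call.
    struct ExposedAction {
        let id: String           // e.g. "com.example.mail.compose"
        let description: String  // natural-language description given to the model
        let handler: ([String: String]) async throws -> String
    }

    enum AgentError: Error { case unknownAction(String) }

    final class SystemAgent {
        private var registry: [String: ExposedAction] = [:]

        func register(_ action: ExposedAction) {
            registry[action.id] = action
        }

        // The model picks an action id plus arguments; the OS only dispatches.
        func dispatch(id: String, arguments: [String: String]) async throws -> String {
            guard let action = registry[id] else { throw AgentError.unknownAction(id) }
            return try await action.handler(arguments)
        }
    }

The dispatch layer really is nearly free; the edge cases live in how reliably the model picks the right action and arguments.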
It's not, because "ironing out all the edge-cases" is orders of magnitude more expensive than just designing a system without edge-cases in the first place. What's cheap is getting away with not bothering: but then you end up with a tech demo, rather than a usable product.
in order to cure Macular Degeneration we have to develop many different technologies that can be used for power control, it's inevitable as our history shows cyclical nature and behaviors of humans are predefined throughout the history because conceptually the same ideas and thoughts are being encoded and rehashed and decoded by newer generations.
Is this generated by AI? Also how does power control or history cycles have anything to do with curing macular degeneration?
I disagree; I think the problem with surveys like this is the focus on billing everything as "AI" instead of presenting it as what it used to be: just a nice feature.
A lot of the small things that Apple has introduced, are things that blend in and are not screaming "AI" at you. Which is a good thing.
So many of the things being pushed by other companies are flashy AI crap to show off, where either the utility is nowhere near the promise due to fundamental issues with AI, or the novelty wears off, it turns out to be useless, and no one cares.
Sure, I wish that Apple was a bit faster with some of this, but compared to things like Recall, Rabbit, or other vaporware, they are going about it the right way. We just need to stop throwing the "AI" name on everything, when most users didn't have "ML" shoved down their throats for the features they used every day powered by (fundamentally) the same tech.
The article is specifically about what Apple calls “Apple Intelligence”, not about other AI/ML-based features they have incorporated.
I think you are downplaying just how many eggs Tim Cook has put in this basket to pitch to shareholders vs. the perceived low % of Apple engineering staff that seems to have been involved in its creation. AT&T was advertising "get the new iPhone with Apple Intelligence" before it was even available for sale! And now it's here, and it's a tiny bump in feature set. A super missed opportunity to rebrand/completely revamp Siri, and instead an implicit acceptance of defeat that Siri is still so dumb you should just ask her to ask "someone else" (ChatGPT). Cook is feeling pressure to explain how Apple is going to crank out unsustainable revenue growth even after basically saturating the world (at some point you've won capitalism).
Apple has to balance it: clearly investors want to shove AI into everything, but most users simply don't care, or don't want to be reminded it's "AI" every moment. Apple has to walk the line between both of these groups.
You are right about investors, but how many of those investors have also funded and continue to push for vaporware products and features? We have seen this many many times over the last couple of years. So I do think their importance needs to be downplayed since they are clearly making bad decisions.
Also, this article was about customers not investors.
Apple has for years been marketing "Siri" as far more than just the voice assistant, instead encompassing many of their ML systems, for example the "Siri App Suggestions". I don't think that rebranding it would make any sense.
users have been actively propagandized by the media against ‘AI’ but will probably find it useful in the cases where they use it
Lots of negative comments on this story so far. I've enabled Apple Intelligence and don't have anything negative to say except that it's difficult to figure out where I can and can't use it when I want to. It mostly stays out of the way, and when I do notice it it's been useful, which imo is about exactly what I want in an AI product built into the OS.
My favorite feature has to be the email summarization thing. When I get inexplicably cc'd on a long thread by some clients, it's nice to have a summarization at the top so I can quickly determine whether it's something I need to deal with now or later. Others here worry about the AI hallucinating in the summary, which is a concern of mine too, but I always read the emails anyway – this is just a useful inbox triage tool (for me).
The only “benefit” of Apple Intelligence so far is that it apparently requires a lot of memory, causing Apple to bump up their minimum Mac Mini configurations to 16 GB memory. I’m loving my Mac Mini so far - but that’s entirely due to the M4 chip inside rather than any Apple Intelligence feature.
I have not used a single Intelligence feature beyond just trying them out initially and being completely underwhelmed.
You don’t find the LM based spelling and word suggestions much better than the old ones?
- [deleted]
Same, it was something I played with maybe for 10 minutes, then forgot about.
Glad my iPhone 14 Pro isn't supported.
Apple Intelligence is kind of uniquely bad. I don't understand how, but its AI message summaries often completely flip the meaning of texts I've received. I've gotten summaries that say "X person hates Y" and I read the message and it's them talking about how much they love it. It is impressively, wildly inaccurate.
It confuses things like people's names and the messages they're sending, and sometimes will paraphrase names that are also English-language words into related concepts and blend them into sentences that make no sense.
The only thing it seems to do accurately is describe images that have been sent to me, though I don't really see the value in a two-sentence summary of an image (folks with limited vision will probably feel differently).
I have never received a summary that gave me any idea of what the hell the messages it was summarizing were about. It's impressively bad. I'm shocked Apple shipped this.
It’s bad because they are using very small local models on device, on the order of 3 billion parameters, as opposed to the trillion+ parameter models available in the cloud. They could keep the security model and have larger, slower models for summaries (what’s the urgency of a summary?) but that would be a battery killer. It’s clear they need to train on messaging content, but they have promised not to use their users’ content.
It seems to me they should let the local models train on local content while charging and do reinforcement learning on their summaries as judged by a larger (private cloud) LLM.
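Roughly the loop I'm imagining, as hypothetical Swift pseudocode (the summarizer and judge are stand-in closures, nothing here is a real Apple API):

    // Sketch: while charging, sample two on-device summaries per message and
    // let a larger private-cloud model judge which is better, accumulating
    // preference pairs for a later fine-tuning pass.
    struct PreferencePair {
        let prompt: String
        let chosen: String
        let rejected: String
    }

    func collectPreferences(
        messages: [String],
        localSummarize: (String) async -> String,
        judgePrefersFirst: (String, String, String) async -> Bool
    ) async -> [PreferencePair] {
        var pairs: [PreferencePair] = []
        for message in messages {
            // Two samples from the small local model (assumes nonzero temperature).
            let a = await localSummarize(message)
            let b = await localSummarize(message)
            // The big cloud model acts as judge: does it prefer `a` over `b`?
            let aWins = await judgePrefersFirst(message, a, b)
            pairs.append(PreferencePair(prompt: message,
                                        chosen: aWins ? a : b,
                                        rejected: aWins ? b : a))
        }
        return pairs  // feed into a DPO-style update overnight, on the charger
    }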
When I first set up Apple Intelligence on my phone, all the “summary” features seemed worthless, so I disabled them. But somehow it was still enabled for Messages. I can’t believe how bad and unhelpful it is! To add to the confusion that showing a completely inaccurate summary sows, it wasn’t even immediately clear that these were summaries; on the notification screen I thought the actual senders were writing these bizarre things.
There is no icon or anything to indicate it’s “AI”? What a miss.
there is a little icon that is like... three lines next to an arrow in an L shape, but it's honestly unclear what it means visually (it doesn't read as AI, and I feel like "little sparkles" is the de facto AI icon) and it's also not super visible or distinct, so it's not always immediately apparent at a glance.
I think this is much more a botched rollout and a lack of vision. Apple should have been working on this for years and they have some pretty big advantages. A deeply integrated version of the best o1 chat and vision models, with context from various apps and the ability to do complex app interactions, could be amazing. They could even have AI create on-demand personalized new applications for specific use cases. They could integrate it as a proactive AI agent regularly interacting with you regarding various goals (instead of just answering questions). Instead, so far we get the most barebones, largely unimpressive features like basic text summaries and toy things like image gen, and they failed to have even the modest features announced in June ready for the latest iPhone launch earlier this year.
This wouldn't align with what Apple is going for though. Their selling point in this category is first and foremost that your data stays private. They're looking to do as much as possible on-device, not in the cloud. This rules out models akin to o1 because they're just too large in every respect. I agree that Apple has the chance to do amazing things here, but I think it will take a while until small models are as capable as they need them to be.
I agree. This feels like the norm nowadays, though: an initial soft launch, then iterative improvements from there. If something catches on, great! If it doesn't, less money/time spent upfront investing in it.
I agree, but based on their software track record over the last couple of years, I have little hope that we’ll get much beyond a disparate collection of mostly half-baked features.
> I have little hope that we’ll get much beyond a disparate collection of mostly half-baked features.
Really? AI has driven most of the interesting features they've released in the last decade. Notably, organization of the photos app and indexing of people, pets, places, etc. (Nb I have not yet updated to ios 18 and I understand the photos app is somewhat controversial.)
Notification and mail summaries are definitely nice to have, though I could live without them.
What I'd really love to have is Siri superpowered with contextual natural language capabilities and multi-commands in a natural sentence. It _is_ getting there: Siri is better than it used to be and has somewhat of a contextual knowledge so far (compared to zero, before the update) but it's not there yet. It sometimes fails miserably. It should get to LLM-level contextual "memory" to be acceptable IMO.
The notification summaries are OK most of the time and downright bad the rest! I'm currently working my notice period, and a coworker messaged to say they were sad I was leaving the company and enjoyed working with me, etc. Apple Intelligence decided the best summary of this message notification was: "News of [Me] passing received, expresses well wishes"
That’s supposed to come with 18.3/4
https://www.macrumors.com/2024/12/11/apple-intelligence-feat...
The true "intelligence" I want isn't in rewriting or summarizing emails and texts IMHO - that's actually confusing and frustrating from a product usability POV.
What I'd truly love is the ability to dictate lists to Siri while walking, driving, or doing chores, and have Siri capture them in Notes as checklists, share them with my partner, and so on. I'd also like more contextual answers from my calendar, and the ability to access and share details from my notes, all in hands-free mode.
The one feature that I love is the Reduce Interruptions focus, which I now keep on all the time. Previous approaches, like simply blocking notifications for most apps, don't work, because some critical apps (like my Uber equivalent) also send spam notifications. The Reduce Interruptions focus quite accurately manages to hide unimportant notifications without me having to block everything.
Except this isn't generative AI, this is old-school machine learning classification which PG wrote about like 20 years ago.
Here is what I want from Apple Intelligence:
Given a user story “I chipped my tooth” I say “hey siri, I just chipped my tooth, can you help me?”
I would want it to:
- help me schedule an appointment with my dentist (it could create a reminder, ideally know who my dentist is, even ideally queue up or place a call)
- put that appointment on my home and work calendar
- remind me to tell my manager I need the time away
- help me reschedule any existing appointments
- compose a message to let my partner know
- tell me to try to save the broken bit for my dentist?
At best right now it may give me a Wikipedia entry or make a chipped tooth emoji.
I think Apple started with hardware that can run little LLMs and asked “what can we run on these that people would like?”
They didn’t start with a vision of what novel machine Intelligence feels like for a user, and slowly but thoroughly build a satisfying realization of that new relationship.
I want my phone to want to help me, and keep trying to get better at it. I need it to know when it fails. I want to be empowered to help it get better.
Ohh and I would like to be able to tell it not to keep featuring photos of my ex wife years after the divorce.
Yeah, maybe just remove those from your library (archive them somewhere)?
The biggest benefit I see so far is the integration of ChatGPT into Siri to answer questions for me while I'm driving instead of giving me "I'm sorry, I can't show you that while driving"
I was going to second your observation: I wonder if the language models are just better than what they had been using, because Siri has become noticeably more accurate recently, in my experience.
I was looking forward to that as well, but Siri refuses to read ChatGPT's responses to me...which makes it substantially less useful. It's just as frustrating as being told "sorry, you'll have to unlock your phone if you want me to play something on youtube music". Let's hope they iterate, I guess?
This seems largely like the new Apple, a company playing defensive/reactionary to shareholder impression, has to chase whatever the latest public craze is. During the Jobs era, it was whatever Steve thought was cool. Which was sometimes wrong as well, but it was less derivative than this new norm.
Feels a bit premature to survey this, because most of the very user-forward features only released last week, so we're still very early in having users update.
The ones in 18.1 are significantly more in the background and restricted to the US.
I’m not saying the results themselves would change necessarily, just that the timing of the survey is poor to have any actual substance be drawn from it.
I'm deeply fatigued by this constant pushing of AI everywhere.
I now actively avoid any of the new AI features and try to block them as soon as possible. Yet they keep shoving it into all of my tools... Notes, the OS, photos, chats...
There are scenarios where I do use and want AI features, like Cursor or occasionally ChatGPT, but that's essentially it.
Apple has tons of user engagement data. You would think they would have a heatmap showing them where their customers are spending a lot of time trying to accomplish a task. I would focus Apple Intelligence, amongst other tools, on reducing the time needed to accomplish that task. If I were the PO, that's how I'd proceed. I'd want my customers to be able to reduce the time they're spending using the phone while getting more done. That opens the opportunity for customers to do new things, perhaps things you can monetize.
I imagine my heatmap would show I spend a disproportionate amount of time trying to edit urls in the browser bar.
> 10.0.0.200
Redirects to > www.10.0.0.200
Thank you safari!
Oof, too relatable! I once saw a shortcut somebody shared that would turn your URL (really any text you shared to the shortcut) into a screen-sized edit box. I didn't grab it at the time, but it shouldn't be too difficult to recreate.
I think the data would likely to show most people are spending time on short videos.
Why do you think that?
> Apple has tons of user engagement data.
I suspect they don't have as much as you'd imagine. Stronger privacy means they have less data, and the data they have is less useful, and their legal teams will restrict things the data can be used for.
I think that's why Apple Intelligence is bad. They built it on mostly synthetic data.
That's not my experience.
You are correct about the culture of privacy at Apple and that is the awesome thing about Apple. The big "Allow Apple to collect anonymous data to help improve..." screen you get when setting up your device is the master opt-out switch where you can turn all data collection off.
For those that leave it on, I think Apple have become very good about finding means to acquire user engagement data in a privacy-compliant way. For starters, by identifying user data with a cohort identifier when they can (which will cluster hundreds or thousands of users' data into arbitrary groups). Or, if specific user engagement is wanted in order to see if usage patterns change, they might create a rotating random identifier (one that is regenerated to a new random identifier every 11 months so as to not build up an identifying profile; rough sketch below).
I think the bigger question for me is if they have the tools and teams in place that can take full advantage of that data to improve user experience. Or if it is just a firehose of data points that are left instead to just curdle in a large vat.
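For what it's worth, the rotating identifier mentioned above could be as simple as this; a purely illustrative Swift sketch, not Apple's actual implementation:

    import Foundation

    // A random identifier that regenerates after ~11 months, so no
    // long-lived profile can accumulate around any one value.
    struct RotatingIdentifier {
        private var id = UUID()
        private var createdAt = Date()
        private let lifetime: TimeInterval = 60 * 60 * 24 * 30 * 11  // ~11 months

        mutating func current() -> UUID {
            if Date().timeIntervalSince(createdAt) > lifetime {
                id = UUID()          // rotate: older data can no longer be linked
                createdAt = Date()
            }
            return id
        }
    }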
The most useful thing for AI on an operating system level is in my opinion aggregation and summarization of data from different apps. And somehow connecting the different apps in a smart way. Like the iOS Shortcuts app, but with dynamically creating shortcut pipelines and much more powerful.
Probably not an easy task at all.
They arrived late to the VR party, while those companies were already pivoting into AI, and late to this one as well.
I'm still waiting for Google / Alexa / Siri to be the personal secretary promised a decade ago. Siri can set a timer or simple reminders, and with Apple Intelligence can do a bit more in a more natural language, but the reliability and accuracy drops off incredibly fast once things become more complicated, to the point where they're near-useless.
I'm still disappointed by what I'd consider to be simple stuff, like hands-free operation of my phone while driving, or even just simple instructions on operating the phone without needing to touch it (let alone look at it). I'm having trouble imagining what the product owners at Apple are actually expecting it to help us with - do they use it in their daily lives?
A family member is blind, and in many ways digital assistants and AI voice control have been a godsend, and Apple Intelligence should have been enabling in so many ways. Instead it's an exercise in frustration, advertising hope where there is nope, as well as being all but hostile to blind users (Apple's is the best of a bad bunch). Arguably the switch from tactile buttons to touch screen slabs has removed more autonomy for my relative than AI has added.
It's no wonder that Apple have found that people see no value in Apple Intelligence, then, because it doesn't add any practical value.
I just won't set up Apple Intelligence, just like I've successfully not turned on Siri this whole time. They ask you if you want to turn it on again after certain major updates, but it's really pretty simple to decline. I'm glad, at least, that they don't just turn it on for you, and not let you turn it off, as some companies would.
The Writing Tools are a gimmick, but the improvements to Siri on the iPhone 16 are great. I haven't used the visual intelligence yet, but I always thought Google Lens was cool, so I hope it works just as well. Image Playground and Genmoji are pretty stupid, I have no idea why they wasted time on that.
Message/email summaries are super weird. They show up when they want to, which is barely at all.
Say what you want about this Apple Intelligence stuff, but it's nowhere near as intrusive or annoying as Gemini or Copilot. You can turn it off and ignore it.
Siri is __way__ better on iPhone 13 Pro too, I've noticed. I bet they pushed better language models with software updates, but I can't prove it.
Apple Intelligence is months (years?) behind on ChatGPT. It can't even deal with other languages. Useless.
I was reflecting on old technologies that were supposedly revolutionary at the time, like Palm Pilots, and what value they really provided. Like, I can imagine someone at that time using a Palm Pilot and claiming that it's making them more "productive."
Looking back though, such technologies are nothing compared to what we have. So there's a perceived gap between our current state of technology (LLMs etc.) and the past, yet the gains in productivity don't seem to match this perception (e.g., if our tools are 1000x better, why do I feel only about 1.1x more productive? Even with revolutionary AI tools, I feel at most 1.1x more productive in life in general).
My idea is that such technologies' utility is secondary to their signaling power. Like, users seeing value in Apple Intelligence isn't the point. Rather, integrating AI into your product is a signal of competitive prowess. And using AI, even if it makes you less productive and produce worse work, signals your faith in technological progress and innovation.
The productivity improvement is negative because of how much of everything is now a "product" instead of just a tool. Whenever you open something, especially after not using it for a while, you're met with an obstacle course of in-your-face modals and popovers about new features that you don't give a shit about.
Software used to let you get your job done more efficiently. Installing a new app and learning to use it felt like discovering a superpower. Now it feels like a chore. Now it feels like computers don't serve people any more, but the other way around.
Agreed, I strongly dislike this trend of "new feature tours" on application startup. When I launch an application it's because I have a task I need to complete, and these dumb popups just step in the way and slow me down.
No surprise. The features they released aren't very helpful.
What is truly helpful is cloud-based foundational models like GPT4o, which a phone is not powerful enough to run locally and Apple does not have a good enough cloud model for.
However, over time, I think LLMs at the OS level will be very useful. It's the ultimate agentic AI platform.
I am using an iPhone 13 Mini, which will need to be pried from my cold, dead, hands.
The new "intelligence" hasn't made any difference that I can note, other than a couple of buttons, and Siri seems a bit less dense.
On my Mac, the Xcode auto-fill has improved quite a bit. It is now useful enough for me to use (but I still need to check its work, as it hallucinates weird API calls).
> using an iPhone 13 Mini… The new "intelligence" hasn't made any difference that I can note
Hello, fellow cold-dead-hands iPhone Mini user: iPhone 13 doesn’t support Apple Intelligence, though, right?
That would explain the lack of difference. My Mac is a new Mini M4, though.
Aren't the newest features that are branded as "Apple Intelligence" (not their other ML stuff that's been around for a while) only available on the iPhone 15 Pro and 16 line?
https://www.apple.com/newsroom/2024/10/apple-intelligence-is...
At the bottom:
> Apple Intelligence is available on iPhone 16, iPhone 16 Plus, iPhone 16 Pro, iPhone 16 Pro Max, iPhone 15 Pro, iPhone 15 Pro Max, iPad with A17 Pro or M1 and later, and Mac with M1 and later.
Unless you're referring to it on a Mac or iPad or something.
You are correct.
My Mac is an M4, though.
The doubling-down on AI will continue until morale improves. If users don't engage with AI it's clearly because there isn't enough AI in the product yet.
Now do the same survey on Siri. I bet the result is going to be the same.
The age-old concept of "affordance" in a user interface, where a device announces its capabilities to the user, has completely gone down the drain. Now we're back to mystical buttons that claim to perform magic, and we happily ignore the tedious process of figuring out what just happened, implying that is not part of the User's Experience.
As a light Siri user since the iPhone 4s, I was looking forward to the better interpretation and context window that an LLM service typically provides.
Maybe this is not the case if "Apple Intelligence" is a different silo in the OS.
Only "intelligent" thing that happened with this Apple shift is finally giving a (minor) bump of RAM to their product line.
Apple in the 202xs is all about hardware. It's quite nice seeing how the latest FCP leverages the new hardware to improve workflows by harnessing the NPU. I do hope their software catches up; it's quite strange that Siri isn't even using a basic LLM yet.
Over the past year+ I’ve developed a litmus test for whether a new AI feature is valuable or not:
If it were presented to me without using the words “AI” or “Intelligence”, would I give a s%!t about it? Can it even be described without using those words? If not, it is not valuable.
The few things I’ve tried on my iPad, which supports Apple Intelligence, haven’t worked, so the fact that there’s a rollout of isolated features that is not clear to the user doesn’t help. I tried some of the stuff I saw in commercials and it failed miserably, ha.
I think the contrast here is interesting:
> almost half (47.6%) of iPhone users reported AI features as a ‘very’ or ‘somewhat’ important deciding factor when buying a new phone
In the context of:
> 73% of Apple Intelligence users [...] stating the new features to be either ‘not very valuable’ or they ‘add little to no value’ to their smartphone experience.
It's obviously hard/impossible/foolhardy to try and put a story behind these two figures, but it's interesting that a big chunk of people who think AI features are important in a phone don't actually find those features add any value for them. Maybe people are under the impression that LLMs or plugins will improve significantly?
The image playground is fun, but in many ways Siri is worse. Previously I could dictate an entire command + message; now you have to dictate the command, wait until it switches to that modality, and then dictate the message.
Notification summaries are great. A couple of my children tend to message a coherent single thought across a rapid-fire set of messages, which previously was condensed down to something like "Person - 7 messages" and now is "Person wants to know what is for dinner as they are currently on a hike and want to time accordingly". It has provided excellent summaries.
Though apparently it adds some notification delay; honestly, I haven't experienced that on mobile.
I asked image playground to create an image of a treeing walker coonhound, my dog's breed, as a ballerina because I thought my daughter would find it amusing. I had to scroll through 4 images to get one that didn't have 5 legs.
My experience with summaries hasn't been much better. The summary provided has been incorrect enough that I can't trust any summaries presented.
The only part of their presentation that was interesting is having Siri be able to control apps and reach into apps for information, and developers having an API to register actions and information with the LLM. I'm not super optimistic about the execution at this point though given how rushed it's all been.
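For context, the registration API in question is, as far as I know, App Intents. A minimal sketch of what exposing an action looks like; the TeamsClient type is a made-up stand-in for an app's real messaging layer:

    import AppIntents

    // Hypothetical stand-in for the app's real messaging code.
    struct TeamsClient {
        static let shared = TeamsClient()
        func send(_ message: String, to recipient: String) async throws {
            // app-specific networking would live here
        }
    }

    // An app exposing an action to the system assistant via App Intents.
    struct SendTeamsMessageIntent: AppIntent {
        static var title: LocalizedStringResource = "Send Teams Message"

        @Parameter(title: "Recipient")
        var recipient: String

        @Parameter(title: "Message")
        var message: String

        func perform() async throws -> some IntentResult {
            try await TeamsClient.shared.send(message, to: recipient)
            return .result()
        }
    }

Whether Siri reliably maps a spoken request onto the right intent and parameters is exactly the execution part I'm not optimistic about.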
I used to be able to tell Siri "Turn off the lights."
Sometime in 2017 or 2018 they "updated" it, and now it replies "Which room?", followed by slowly listing out the 6 or 7 options, including "everywhere."
I haven't been able to reprogram myself to say "Turn off all the lights" which works as before. Hearing that reply is the equivalent of hitting your toe on a wardrobe.
This update didn't fix this. The functionality is the same as before.
Basic stuff, but no way to fix it.
Similarly, stopping an alarm in another room/on another device doesn’t seem possible without Siri having to ask back about the device/room, and the user having to affirm it. That back and forth takes an unnerving couple of seconds. You can’t just say “Stop the alarm in $room/on $device”. It gets worse when the device you talk to is simultaneously playing back something, because then you first have to stop that playback in order for Siri to not misapply the “stop” command intended for the alarm.
I'm in the same boat. My phone keeps asking me if I want to update to Gemini, but I won't because Gemini just gives my typical LLM BS responses when I just want to do a basic Google search with my voice. Gemini just complicates my already complicated life, so I don't want it.
People on the 9to5Mac website do not want their jobs to be automated, therefore there's a certain pushback against automation. This creates a great opportunity for those who are willing to use this arbitrage.
AI, especially LLMs, needs to move beyond being a "checkbox feature" and focus on solving real problems effectively. For example, as some users noted, personalized AI capabilities, such as context-aware assistance for accessibility or seamless integration across devices, have the potential to revolutionize user experience. However, the current implementation often feels more like a marketing exercise than a thoughtful enhancement. Until AI tools provide consistent, tangible value, skepticism from users is likely to persist.
IMO the most useful features are the summarizations – something I'm not sure the average user even recognizes as "AI".
Image/emoji generation isn't really useful but I expect the photo editing capabilities will get a lot better over the next few years. The camera is one of the biggest reasons people buy a new iPhone but most consumers don't realize how much computational photography is involved.
I can definitely see the groundwork for more compelling features.
You only need it to summarise badly once to stop finding it useful though. There was a BBC notification that it summarised as "Luigi Mangione shoots himself" a few days ago.
I'm weirdly happy with Apple taking their time. Seeing AI stuffed into every nook and crevice of various non-Apple apps has made me appreciate apps that use AI tastefully or not at all (counter example: today I tried the LLM built into GitHub's feed page - to describe it as a 'toy' or 'trash' is not overstating how embarrassingly useless it is!). Better not to have junk features, or ones that use AI for AI's sake.
But it seems that the whole Apple Intelligence initiative is impacting iOS 19 [1], which to me makes Apple Intelligence detrimental to the whole OS. Some of the features are nice (smarter Siri, notification summary) others will probably be left to rot like the whole Animoji thing.
[1] https://www.moneycontrol.com/europe/?url=https://www.moneyco...
The only tool I appreciate is the text proofreading option. The summaries are ok but I could live without them. The rest is no good to me. I think I’ve been trained for years to avoid Siri (because how embarrassingly bad it is) that I just never even think to “ask” my phone for anything.
I had to disable it on my 16 Pro the other day.
My phone was freezing or rebooting at least once per day and on a whim I disabled Intelligence. Since then I've had zero problems.
I wasn't even using it much. So whatever it does in the background must have a memory leak or something.
> the survey recorded 73% of Apple Intelligence users ... stating the new features to be either ‘not very valuable’ or they ‘add little to no value’ to their smartphone experience.
They make it sound negative, but I'm lacking context on how it compares to other recent features. I actually would assume that with 27% of users finding a feature to be somewhat valuable, it's a relatively successful launch.
Someone else made a similar comment already, and I just don't understand why you default to 27% liking it. Of those responding, less than half said that AI was even a consideration in their new phone purchase, so we already see that most people do not care about AI from the off. Of those reporting using the feature, 73% state they don't care about it after using it. That does not mean the other 27% said they did like it; it is not a binary Yes/No option. The numbers are just not there in TFA, and you're jumping to conclusions.
I didn't imply a yes/no; based on TFA, I assume that they used a 5-point scale, with the middle option being phrased something like "somewhat valuable", and there being two more positive ones. TFA is critically lacking in both the survey details and what other features got on similar surveys.
I think it does calculations when I type the = sign.
That’s occasionally been convenient.
But that’s more than balanced out by the couple of times it’s rewritten the words I’ve typed to something I absolutely did not intend on saying.
And the first time, I ended up actually sending the message, and it caused a whole lot of confusion. I would have thought it was autocorrect, except the AI changed a couple of words and the spellings weren't similar.
It drains the phone battery so fast it's unusable as is.
The delay that Apple Intelligence introduces into notifications is also unacceptable. On both desktop and iOS, notifications are _very_ late - often more than a second on my M2 Max laptop - which ends up being quite distracting.
Slightly off-topic because it’s for MacOS, but the new Image Playground has been fantastic for generating assets while I’ve been prototyping a new game.
Image Playground is available for iPhones too. I haven't used it on the Mac yet. But I've sent people their AI-generated images, and everyone's been pretty impressed by it.
I guess the point is: even fantastic AI products are often not useful to the majority of people.
I’ve found the AI summarizer helpful for parsing my boyfriend’s drunk and/or stoned emo texting.
All I have from "Apple Intelligence" now is the super annoying extra delay when texting an image.
Unlike other areas, they don’t seem to have found any sort of way to apply the Apple special sauce.
If you’re just doing the same as everyone else then you can’t justify any premium.
Maybe they’ll pull something out of a hat, but I think if they had something they’d have done it already.
I tried Image Playground on the Mac and can't think of why I'd want to use these corny-looking images. It downloaded a ~5GB model and doesn't offer the option to remove it. Anyone know where it's stored so I can delete it?
Not surprising. This is an early alpha phase to be fair, where there isn't enough integration to be wowed. The features are basic, and if you use other Apple apps, you get some help, but it's not as deeply integrated as you would like it.
I think on phones, real value comes from intuitive actions which are deeply integrated and peripheral, versus having to be prompted every time. E.g., a very simple use case is just knowing when to disturb me via notifications (disturb me when I need to intervene because something is blowing up in a group chat I have muted), or automations and insights that come from aggregation (e.g., I get notifications about a spend; smartly put them in a sheet somewhere that I can look at later). Translation, transcription, image editing, etc. are all already covered. Those are more nice-to-haves and not essential at the moment.
It makes it worse that it consumes space on your phone that you can't reclaim.
I dunno, I get a giggle out of the unintentionally satirical news summaries.
Personally, I think it’s bad to use LLMs where people need to have trust or don’t expect inexactness; they are handy for some things, though.
The ONLY Apple AI feature that was even remotely interesting to me was the photo clean up. It works pretty well for removing objects from a photo, which would otherwise be hard to do on a phone without opening the photo in another app or messing with it in a desktop application.
Android has had that for ages, and it's cool to play with maybe twice, and then I kind of forgot about it. It just turns out I don't take a lot of pictures that need cleaning up.
I enabled it for awhile, but turned it off.
My problem, as usual, is trust.
Summarizing a website, news article, or email? I can't trust that the summary doesn't contain hallucinated information.
The only use case I find it good for is coding. And even in that case it’s more of getting from 0 to 50% quickly and I have to make corrections and finish the last 50%. I have not had a single other use case that it seems well suited for.
What product are you using that has Apple Intelligence built into it?
One major contributing factor: it is in beta and not enabled for most iPhone owners!
When Apple announced Apple Intelligence I was very curious about how they would go about it. I expected a very innovative and elegant implementation but so far it feels very clunky. And the results aren't very good. ChatGPT is way better.
Putting it at front and center of every sales presentation and then rolling it out slowly in small steps didn't help either. Siri still sucks.
I had no idea that any Apple Intelligence features were enabled at all.
The E-mail categorization is so bad that it's totally useless. Most of my E-mails are categorized wrong. I can't see how anyone thought it was a good idea to deploy this.
A $3 trillion company. What are they doing with the money? I'm baffled they could still be putting out anything less than exceptional quality.
Money and quality are rarely correlated.
I was considering enabling it, and I could see the features being useful for some people, but not for me.
For example, notification summaries. If someone has apps blasting them with notifications, summarizing them into fewer more meaningful ones could be truly helpful. But for me, I’ve carefully managed my notifications. I get almost none. Every single one that is enabled is important and actionable. I don’t need to summarize what’s already small.
The writing help is another example. Maybe a person is a slow typist. Maybe they are losing a lot of time thinking of what to write, or how to word something. Maybe they are writing in a language in which they are less than perfectly fluent. In those cases the Apple Intelligence can be truly helpful. None of those are me, so I disabled it.
Agreed. Different for different people. Notification summaries are a lifesaver for me.
I use MS Teams at work but with no Copilot; lots of group chats about lower-environment issues or prod issues that move super fast. The notification summaries have legit saved me 10 minutes of getting up to speed, and let me decide quickly whether I need to intervene or keep working my separate workstream.
The only downside for me: it keeps the summaries VERY short. Another 5-8 words, as an in-between between the notification and a full app open, would be great for me. (But I acknowledge I'm likely an edge case.)
Notification summaries is an interesting one, feels like an email spam filter. I've always been of the mind, though, that if I'm getting too much spam I need a new email address rather than a better filter (or in this case, better notification limits/settings).
I wouldn’t know, because Apple doesn’t want iPhone 13 Pro Max users having access to their AI functionality. So my 18.2 update is neutered compared to someone running the latest device.
This is actually due to RAM: the 15 Pro and above have 8 GB, which is enough to fit a 3-4 GB model locally with room to spare for the rest of the applications. The 13 Pro Max has 6 GB of RAM, which is not enough.
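Back-of-the-envelope for why a ~3B-parameter model lands in that 3-4 GB range (the quantization level is my assumption, not a published figure):

    // Rough model-memory estimate: parameters x bytes per parameter.
    let parameters = 3_000_000_000.0   // ~3B-parameter on-device model
    let bytesPerParameter = 1.0        // assuming 8-bit quantization
    let gigabytes = parameters * bytesPerParameter / 1_073_741_824
    print(gigabytes)                   // ~2.8 GB, before KV cache and runtime overhead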
Not all Apple Intelligence is run locally. They could have added all the API-based stuff to older phones, but they didn’t
Oh that makes sense, thanks for explaining!
Biggest disappointment with Apple Intelligence is that it doesn't seem to be able to do anything with the apps I actually use. I can't say "find this email" because I use the Gmail app instead of Apple's built-in Mail app. I can't say "send blah blah blah to Bob Smith on Teams" because Teams doesn't integrate at all.
Without integrations into any apps, a smart assistant isn't very smart.
Different assistant, but I'm still astounded that my Fire TV can't understand the words "play"/"pause"/"unpause"/"continue" consistently across all apps despite having a dedicated button on the remote for doing just that. The bar is low.
You can take out the “Apple” and it would still be true.
I'm new to iOS - just switched from Android about 10 days ago. I didn't move for Apple Intelligence, but it's a feature I've played with.
- Writing tools are neat, but I don't have much use for them. They worked pretty alright.
- Image playground is neat, but I haven't and probably won't use it outside of goofing around and showing a buddy.
- The new Siri is awful. I don't know the old Siri very well, but it's just not good. I had to ask it something like five times what time one of yesterday's football games was going to be.
- Visual intelligence isn't even a thing. I can't believe they're advertising it like it's interesting. It's just asking ChatGPT about an image (which the ChatGPT app already supports) or using Google Lens (which the Google app supports).
What I am impressed by is how much gets run on device. About half of the writing tools are fully on device, and Image Playground was running entirely on device. That was very neat to me. I tested by turning off all data connections and seeing what was unavailable (ran into this with the writing tools).
I'm still super hopeful for Siri to use my personal context. I think that one is set for March 2025 in the US. If they can pull that off, that will be the most valuable personal use case of AI for me so far. If I can just ask "When does my flight leave?" and it can immediately know I'm talking about my flight next week to visit family for the holidays and either use the email or my calendar to pull it, that's just incredibly neat to me.
Other than that, I've been enjoying the iPhone and some of their other machine learning features that have been around since before Apple Intelligence. Searching in the Photos app is blowing my mind. I think Google Photos would do that too if I had backed up my images, but having this on-device on the iPhone is just insane. I can't believe all of y'all have had this for years.
Have you played with Shortcuts yet? That's the one iOS (and I suppose iPadOS and macOS) feature that simultaneously makes me very happy and infuriates me. It's so satisfying to get something working, but it's usually a pretty frustrating experience getting there.
I'm so interested in Shortcuts and I love the idea of them, but I haven't found much of a personal use for them yet outside of a shortcut to a specific Google Sheet.
That said, I love that Apple has made this a first party thing. Tasker on Android was a bit finicky (not a knock - it's gotta be hard not being the platform owner and dealing with so much different hardware).
I want to have a use for them so badly! Everything I would've made a shortcut for back in my original iOS days (which ended in 2014) is something they've since added to the OS directly, so I really can't complain too much.
The image generator in 18.2 is... honestly kinda pitiful. It doesn't seem to be able to produce images that match two-sentence descriptions.
I thought Apple Intelligence would make Siri smarter, but I have been extremely disappointed. Apple Intelligence does nothing useful and makes Apple look stupid. Siri is still less than 50% useful; the only time it actually works is when I ask it to create a timer or an alarm. It's otherwise useless.
My kids and I have a game while I'm driving where I will ask Siri a perfectly reasonable question and we will guess whether or not it will answer it reasonably. I don't understand how it can be so bad compared to everything else out there.
iOS 18.2 and macOS Sequoia 15.2 were released five (5) days ago, and you need an iPhone 15 Pro or better for full Apple Intelligence support. It's a touch early, friends.
Of course they see no value in it, because most of them don’t even know they’re using it.
Apple don’t usually market these features, but they had to this time round.
LLMs can do useful things… but they seem to be getting shoehorned into tasks that they’re not good at.
The AI feature I have found the most value from is the OCR for copy-paste, which has been available on Samsung phones for years. Everything else is a gimmick. Voice AI only wows you once and gets tiring very quickly.
It's legit bad, with an awful onboarding experience.
I opened Image Playground on the iPhone, and every prompt errored out. I deleted the app.
Five minutes later I get a notification: "Image Playground is now operational!" Like, what the heck!?
I also turned off every summary function. I need to see the latest update in an email thread, not the start. Whoever designed this one... omg.
Gemini on Android is miles ahead, it seems.
Chatting with Gemini is very good. I notice Google have realised this and are advertising it a lot.
It seemed to be buried before.
I recently upgraded my iPhone 11 to a 16 (mostly because the battery wasn't great anymore). I've turned off all of the gen AI stuff and enjoy the longer battery life and better camera. I'm not a gen AI skeptic - I use it in other contexts (e.g. Copilot, ChatGPT). I just don't find Apple's current AI experiments that appealing (emojis?? really?). I guess they'll gradually find the natural fit for these technologies in places where they actually bring value. At the moment it just feels like a sales pitch to justify the over-specced phones we keep buying.
The Message suggestions are insultingly awful. And I can't find a way to turn them off.
I was excited for the Genmoji, but they seem to be highly censorious and in general just have no idea how to represent many concepts as emoji.
The image playgrounds are obviously a nightmare that should never have seen the light of day.
The summaries are nice.
I can't even get it on my iPhone 15 Pro Max. Insane. But I don't care. Apple Intelligence is useless, and Siri is SO pathetic that Tim Cook should literally be fired by the board for gross incompetence. It can't even set a timer on the latest flagship device.
Are you saying you can't have Apple Intelligence on your 15 Pro Max? Yes you can - it's one of the supported models.
I’m not sure what difficulty you’re having with timers. It’s quite easy to do multiple timers: “hey siri, timer one hour”, or “hey siri, start a pasta timer for 9 minutes”, or “hey siri how long is left on the pasta timer?”
I use it all the time.
Timers are in the Clock app, or just ask Siri to "set a timer for X minutes".
When I was an Apple intern, Tim told us that technology only has an impact when it changes behaviors. Otherwise it’s a “gimmick.”
I wonder which behaviors Tim has changed from using Apple Intelligence?
I don't know what it does, except that Siri now seems able to answer complex questions at the level Google Assistant managed at least ten years ago. Can anyone tell me what I'm supposed to be doing with this?
I use Claude/ChatGPT a lot. I turned Apple Intelligence off. It was summarizing incoming text messages instead of showing me what people said at a quick glance.
Also, I don't really e-mail off my phone. If I'm going to write an e-mail, I would want to just write it and be done with it. To write it, have AI mangle it, then need to re-work it sounds like more work.
It'd be cool if I could tell it "show me all of my photos with ____" (I know something like that exists today; I mean something more advanced), but that's about all I can think of.
I imagined an alien species having a headline on the condition of our planet that read:
“Humans see little to no value in intelligence so far…”
100%. I'm still on the prior rev of both MacOS and iOS because I don't see the upside to upgrading.
Most AI products haven't been integrated to the point where they can actually be useful (largely because they still completely fuck things up and no one trusts them). I don't want AI to write words for me; I want it to perform the middling multi-app tasks for me.
Not "write this text for me" but "put together a kids music playlist and start it when I connect to carplay" or "take the last few photos of the kids and send them to my mom's photo frame", "respond to all political texts from unknown numbers with STOP", "write a bash script that does XYZ so I don't have to manually workaround this dodgy hard drive issue every time my computer restarts"
The promise was personal assistants, but they're still mostly chatbots.
Please Apple, add an option to let Siri transfer my requests over the internet to handle them with an actually intelligent helper. Currently, Siri is only good for setting timers.
The new features in 18.2 GA release are a total shit-show. I've never seen Apple software this rushed out the door before, probably to beat the end-of-year shutdown.
Image Playground is full of so many bugs and weird UX patterns.
- The app is called "Playgrounds". Weird, especially considering Apple already has an app called "Playgrounds" (full name: Swift Playgrounds). A minor thing, sure, but especially confusing if you put yourself in the shoes of someone who has no idea what AI is or what Apple is building.
- When you launch it, it has to download models, a process which took over an hour on my internet connection. But it gives you no indication that the models are downloading (beyond one informational popup when you first launch the app), and no download progress.
- Instead, I was placed into the full app experience, where I was allowed to try to generate images. The generations would fail with an error.
- However, the error is totally unreadable because it's presented as a toast at the top of the screen, rendered underneath the Dynamic Island. All I could see was the first few and last few characters.
- I've heard confusion from two friends (techy bubble, of course) about "why can I only generate variations of these four friends?" It's because the list of people you can generate images of seems to come from the facial-recognition targets in Apple Photos. Yup. None of my friends who have tried this pay for Apple Photos, so their photos aren't being backed up, and their lists of generation targets only include people from recently taken images.
- You can upload arbitrary images to the model. But it's... well, it doesn't do anything predictable. A picture of a bookshelf imagegens a piano. A picture of a person fails with an error (also clipped by the Dynamic Island): "choose a photo with the face more in view". A picture of a black SUV imagegens a yellow jeep. The immediate sense I got is that they're feeding the uploaded image into a "describe this image" LLM, summarizing it, then feeding that summary back into the imagegen model (see the sketch at the end of this comment).
- There is one additional option in this menu, labeled "Appearance". I dare you to click that, put yourself in the shoes of someone who doesn't browse Hacker News every day, and try to understand why it's there and what it does. I think it's presenting a way to generate a generic AI-generated human without a source photo? You get a choice of skin tone, and then some kind of ever-changing collage of Vibes for the person you want generated? I can't even explain what's going on, because even I'm confused; it's incomprehensible.
- The share sheet breaks like 20% of the time. On one occasion this break crashed the app with a popup that displayed a stack trace.
- We were told we'd be able to generate images in three unique styles: Sketch, Animation, and Illustration. Only two of these are available (Sketch is absent) [1].
It's really bad, and even when it works, the images it generates are pretty trash.
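For what it's worth, here's a minimal Python sketch of the lossy round trip I'm hypothesizing above (pure speculation about Apple's internals; every name below is a made-up stand-in, not a real API):

    # Hypothesized Image Playground pipeline: image -> caption -> image.
    def describe_image(image: str) -> str:
        # Stand-in for a vision-language "describe this image" model.
        return "a dark-colored off-road vehicle"  # specifics already lost

    def generate_image(prompt: str) -> str:
        # Stand-in for the text-to-image model.
        return f"<generated image for: {prompt!r}>"

    def image_playground(image: str, style: str) -> str:
        # If only a short caption crosses this boundary, details like
        # exact color, layout, and identity can't survive the round trip.
        caption = describe_image(image)
        return generate_image(f"{caption}, {style} style")

    print(image_playground("photo of my black SUV", "animation"))

If something like this is going on, it would explain both the bookshelf-to-piano and black-SUV-to-yellow-jeep results: the generator never sees your image, only a summary of it.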
Technologies are starting to feel like JavaScript libraries: every year there is a new one, and the next year nobody cares about it anymore.
I don't understand a lot about AI or quantum computers, but I'm 50% sure that in the future we'll be sold QPUs to run AI, or something like that.