From “Hey, You Assouline” to “This Software Is So Good I Forget I’m Using It”
Why Should We Be Forced Ignore Small Yet Fundamental Errors in AI? Our Judgement Is Better Than That.
I’ve changed the names to protect privacy, but for those involved they would instantly know from the title who wrote this article, remember our project, and would laugh. Assouline is not a typo.
“Hey you Assouline, it’s not working”. She hung up.
It was Lori, my business sponsor in Leasing. I was heading up the team of software developers building out the software for the Bonus System that would help Lori finally tame a process that took 5 weeks 2 times a year to distribute bonuses for the sales personnel. She and Julia were acceptance testing the new system, and up until today this was the first major bug. But it was a critical bug. Acceptance testing is a process in software development where the people using the software ensure that they can verify it does what it’s supposed to do. That it works. And in our case, that the system paid leasing agents correctly based on their negotiations with our tenants. There was a lot riding on the project.
Automation and software is about responsibly ensuring a set of steps are repeated reliably. Just because something is performed without your shepherding or controlling each step doesn’t absolve you of the responsibility of knowing that the process does what it should for whom it should.
I have been performing my own acceptance testing on Google’s latest AI model called Gemini. There are impressive aspects, but there are fundamental failings. They may appear small, but the implications are concerning. Yet we hear that AI is the solution, for nearly everything. And we’re told we don’t understand AI. Yesterday I was told I didn’t understand the proper use of AI and to quiet whining.
Would you use a spreadsheet that intermittently gave you a wrong answer, but you couldn’t tell why? Why should we just accept that AI becomes our arbiter of truth, and what facets are used to judge that it’s superior to our own judgement?
Without acceptance testing - showing that the stuff works and you can trace the steps reliably - is it responsible to trust AI? Finnish company Klarna claims that ChatGPT AI performs the work of 700 customer service agents. At what level of quality it’s uncertain. But it’s vital to understand if we are expected to abdicate responsibility and let AI steer the ship.
If you haven’t ever worked on a technology project, I’m going to tell you of a fun adventure that taught me a lot, and it will give you insight into what is being overlooked when you see the cheerleaders for AI claim it will eliminate the need for humans to perform work. The lessons from my history in software will help you judge the effectiveness of AI and software if you are faced with the decision to use AI. That will help you pushback on blind acceptance. Then we’ll jump into what I’ve experienced with Google’s Gemini.
In the end, AI is software, software is a tool, and when it fails to meet simple yet fundamental criteria, we shouldn’t be told to stop whining and that we don’t understand because most facets of AI are just so cool, it does so much we have to admire it anyway. We can’t be rushed into accepting certain aspects of AI that we wouldn’t put up with in earlier eras.
We’ll also learn why I could be an Assouline.
How Do You Make Software, Assouline
Assouline was a boutique at Crystal City in Las Vegas, a commercial retail property managed by the company where I worked. The property was high end retail, and at the time Crystal City made $1400 per square foot in sales. I’m a dumb guy. I wrote software, was skeptical of everything, business included, and by sheer dumb luck had ended up working at a wonderful Real Estate Investment Trust full of some really crazy people like myself. Crazy thing is, my company managed malls and I hated shopping. My life is full of those types of irony.
I learned more there than anywhere else. I was warned by my boss at my former company that I wouldn’t be challenged there - he was very wrong. I was told by colleagues that I needed to stay with software developers who would be better mentors, and they were very wrong.
Assouline was Lori’s and my joke, because like an idiot I read that name aloud at a meeting not knowing how to pronounce it. “Ass-oh-line?” was how it popped out of my mouth. I’m used to being laughed at and not caring for that type of blunder. I run the ball that way, and manage to say dumb stuff like that. That’s how I am.
The current process that my system would replace was using 80 linked spreadsheets no one could touch because if you moved something it would come crashing down like a house of cards. The current chaos was responsible for more disputes between our leasing agents and management. It affected people’s pay. I am changing names here for people’s privacy, but Lori will know it’s me by reading the title. And yes, I am an Assouline at times.
Software is ultimately about communication and crafting a vision and then realizing that vision. That actualization process is based on psychology. Working with Lori taught me more about that process than any other opportunity I had. I have been lucky to have done many interesting software projects since then, but the foundation of those successes was based on what I learned with Lori. She did not abdicate her responsibility ensuring that the processes we automated worked reliably. She made sure she could perform her task more efficiently with what we would build for her.
Lori was not a techie. At all. “You guys drive me nuts with all your terms,” she said to me at our first meeting. “I’m used to using spreadsheets for processing the bonuses - it’s a super headache but it’s the headache I know. But I’m not confident we can find something that will replace the way we do it now. There’s a lot of money riding on this, and we’re under pressure - a LOT of pressure - to get the leasing agents to sell aspects of the leases and tie their pay to reaching those goals. But I’m really nervous about this.”
Lori was brutal and honest. She needed to be because she was in a position that affected a lot of people at our company.
I came to the REIT industry from consulting at PWC, and there we were trained that you find the best practice tools and bring them to your “customer”, run demos of the best features, and after a vendor bake off process, you pick the best fit. This, in theory, saves you time, because you can select from proven industry standards. I knew that there was no solution we could buy that would do what we needed already, but we could easily get a software library that would make the screens super efficient, and the software that made these user interfaces could be re-used on other projects. This package was the best rated product in the industry, and as a manager I could approve the purchase myself.
My team took two weeks to create prototypes while we interviewed Lori, Julia and their assistants to understand the business rules for processing the lease bonuses. There were 90 different combinations of incentives, each with its own calculation. In addition, each 6 months new incentives were created, so the system had to be easily updated, and not by my team in IT, but by Lori in Leasing Administration. So we needed the flexibility of spreadsheets without the nightmare of “who moved my spreadsheet - the other 79 now don’t work”. In 2009 this was a challenge due to budget constraints and to technology. The trend was to buy big software systems, but those were actually double the cost because they didn’t fit your business practices that made your business unique. But we loved a challenge.
We were super pumped with our demo, and invited Lori to a meeting.
She hated it.
“I … I … This is confusing. What’s that thing?” She tapped a funnel icon on the screen. “Like what was it?”
”That’s a funnel - you click that in order to filter down the list,” I said.
”Wait, when you what? Here’s your problem. You nerds get all excited over these little … what do you call them, icons? Yeah, well you get excited over these icons but the screen’s are not simple. At all. I can’t use it. I can’t train anyone to use this. It’s way too hard.”
”Ok,” was all I could say. This wasn’t going to go well.
”You know what your problem is, you always make me take the elevator all the way to the bottom floor.”
You always make me take the elevator all the way to the bottom floor?
Lori was a bit annoyed, and now I was.
”What does that even mean,” I asked abruptly.
”This thing.” She pointed to the scroll bar on the far right. “What do you call this thing? You make me go up and down and up and down. I get dizzy. Don’t make me go to the bottom of the screen, then all the way to the top, it’s a waste of energy. I can’t concentrate with all that bouncing around.”
Lori’s Elevator - who wants to scroll all over the screen to get what you need?
I could see the looks of frustration on my team’s faces. Leasing and Lease Administration were lovingly called “The Nuts At the Wheel”.
But Lori was right. She had a great point in that Leasing’s strengths weren’t computers, and it shouldn’t be. It should be sales. And anything that broke that context and focus would derail their efforts. To her, the scroll bar was an elevator. So she got the term wrong. So what?
Most techies think that this is just someone who isn’t “trained”, or not technical. That’s a huge mistake. Then I realized that if she could work through the screens efficiently, in the way that matched her style of constant interruption, her job would be easier. Managing leasing agents is like herding cats, and if getting the right information was too hard, she couldn’t do her job.
Or fulfill her responsibility.
Lori’s vocab was different, so instead of me being the “consultant” and superior technologist with an ego and telling her what words to use, I had to think like Lori. Lori was a manager, so she spent more time using computers than the leasing agents. The VPs used computers even less, and when they did it was with email. We had people who were GREAT sales people who had been at the company 20 years, and it was silly to expect them to become tech nerds. It would wreck their productivity as sales people.
”Why can’t I have something where there is a box, I type in it like Google and it finds the leasing deals? So I want to see bonuses in the first level of manager approval. Why can’t I just type “manager” and just that appears? Like Google. And easy like email.”
Simple criteria. And we were going to give her a complex set of screens. But the lesson here was that our judgement just sucked. Lori, despite calling the scroll bar “the elevator”, had far better judgement than I, or my team.
My point with this is building software requires a different type of judgement than what the software developer generally possesses. The person in the seat, working day-in and day-out, has better judgement on how the software meets their NEEDS. No matter how weirdly they describe that acceptance criteria, I learned to set aside my own vocabulary and had to learn how to think like Lori.
If the tool doesn’t meet the fundamental need, it’s no good. If the tool can’t perform as expected, it’s no good. If it’s inconsistent, it’s no good.
So I tossed out the PriceWaterhouseCoopers rule book. One of my developers found something that did exactly what Lori wanted, it was open source, meaning that the originator of the software placed the code in the public domain, it was well tested, and as long as we committed the license restrictions, it was free to use. We adapted it to our needs. Big win.
We worked our asses off. And while I’m on those adventures, funny things happen. Like Ass-oh-line, who was a tenant at Crystal City in Las Vegas. After I butchered the name in front of Lori and the others, we chose Assouline as the main component of our testing data. Yes when I screwed up I’d get a phone call “Hey you Assouline …”
And we built something that worked so well, it rocked by Lori’s standards. It wasn’t awesome because of my vision, it worked because of Lori’s vision, and her common sense. She went from taking 5 weeks to 8 days processing the bonuses. Huge time savings, and super organized. The VPs could get a report in 3 clicks. No long “elevator rides”. The leasing agents could identify for themselves the economics of the leasing deals the company and more importantly, for themselves with this tool. Upper management could easily tell what incentives were being offered and achieved. New incentive programs could be introduced easily, without my team of software developers going through a “Hollywood style premiere”. With a click you could go and look at a pdf of the actual lease.
She wrote in her review that she sent to our VP “This software is so good, I forget I am using it. It’s up all the time, I can get in without even thinking. I use it as much as email. I can’t work without it”.
This would not have happened if it didn’t meet its fundamental goals. Fancy screens and complex reports would have been immaterial if the system was not sound with fundamentals. We didn’t make Lori take the elevator all the way to the bottom of the screen, and back up. She could fulfill her responsibility because she could use the system easily, like second nature. “I forget I am using the software”. Our dumb ideas didn’t get in her way. She had a tool that she could use to fulfill her responsibility.
And this is where AI can fail. We chose to ignore that for fancy dreams.
The Twin Personalities of Gemini 2.0
Here’s a simple but HUGE failing with AI from Google. It’s enough to make me question how much more productive you are with it. I need to be fair and say that there are a lot of very impressive capabilities with Google Gemini. Impressive enough that if they can correct some of what I am going to describe it could be a fantastic tool. It has voice recognition capabilities that can be just like on Star Trek and interact with you while you work. I’m saving that for future analysis and writing.
But AI is software, and is a tool, and that is only as good as the judgement of those who use it.
Generative AI means that AI will generate or create answers or results in the form of written text, software code, images and video. I want to avoid anthropomorphizing AI, so bear with me. Generative AI is good at pattern recognition, and then using that pattern with slight variations, and it excels at parsing large datasets to identify patterns, and variations of patterns. You will also hear the term Large Language Model (LLM) in association with Generative AI. A LLM is a more specific type of Generative AI that analyzes language and grammar and works with text to produce summaries or translation.
But unlike a database where a programmer creates the structure of how the information is stored, AI LLMs can take the jumble of different types of information, and form categories as it goes.
But the nasty secret of AI is that - and I’m using the industry phrase, it’s not a good one - it hallucinates. Now I’m going to be like Lori, use my own judgement and not adopt the vocab that the AI Guys like.
Here’s the deal: AI gets things wrong.
It’s inconsistent. And while it is wrong, it will be very eloquent and authoritatively state something is a fact. When you point it out that it is wrong and ask it to correct the result, it may tell it’s fixed, but the answer is still wrong.
This happened to me with my experimentation with the latest Google release.
My goal was to test a field of knowledge that was not dynamic and would basically consist of finding facts, associating concepts and providing the sources for the answers. After all, AI is being touted as being better than humans and should be viewed as a quasi guardian of truth. The thinking is that since AI can analyze vast amounts of information and match patterns, it should excel at providing facts as well as identifying falsehoods. In a podcast last year I relayed the use of ChatGPT in “re-educating” adults who had notions that were considered conspiratorial.
In AI you issue commands with what is called a prompt. Since AI adapts to what commands are supplied, you can set overall constraints at the beginning of a session to maintain quality and consistency of the answers. You can also direct AI to format the answers with tables, charts, text output that can be imported into spreadsheets. In this respect it’s a huge time saver.
From my test I set the following system parameters:
Every time you give me an answer, AI, I want to see your reference and I want the URL of the source so I can go and verify myself.
I asked Gemini about the proponents and theorists of Natural Law, and instructed Gemini to build a timeline of the evolution of the thinking that formed that basis of Natural Law. Since this is established history, there shouldn’t be an impact with a training date cut off for Gemini. The corpus of information that Gemini was trained with included Internet sources from June 1st 2024 and earlier, as well as journals, books and transcripts from countless digitized libraries.
While Gemini created authoritative answers that appear to be correct, I decided to click some of the links. Here is a link for the answer Gemini supplied for John Locke.
These are all the references from the Gemini’s answer.
Here are the results. All three are incorrect.
I then asked Gemini to correct the situation. Remember, the overall constraint was that I wanted actual working URLS from the information it used to formulate the answers for me.
It failed to accomplish that.
Interestingly enough, Gemini warned me that I may have “unsafe content”. I didn’t realize that the Enlightenment and discussion of Natural Law and punishment was unsafe.
I was warned that the MAY be a possibility of unsafe content.
Stepping Into An Empty Elevator Shaft
We are being rushed into accepting certain aspects of AI that we wouldn’t put up with in earlier eras. The acceptance criteria for success, such as defining and meeting expectations, where we are the judges with standards, is still vital today, and should not be cast aside because the technorati are all excited about the promise of technology. I say promise because there is a lot of hype that is used to sell aspects that may come to fruition, but people will lose their jobs over. And sadly there are a lot of foolish people who shallowly judge the validity of something based on fancy vocabulary alone. They would dismiss Lori’s “taking the elevator to the bottom of the screen”. For them, the magic is in the promise, not in common sense.
Google Gemini produces a lot of text. Very rapidly. But not all the results are correct. That lends to the illusion that it’s superior. I couldn’t pump out answers like it can. But pumping out incorrect answers rapidly is scary. What is the phrase, “a falsehood can travel halfway around the world before Truth gets it boots on”?
But the fervor Tech Bros have is dangerous because AI will be used as an excuse to fire people in the name of efficiency, and we may not get what we think we’re being promised. OpenAI, the creators of ChatGPT, is offering what they claim is PhD level expertise for $2000 a month. Will the underlying fundamentals be checked, or will it be assumed that you are getting a PhD that is nearly inerrant because of how quickly it can provide eloquent text? [Get Link]
I’m looking at you, DOGE Bros. We are very close to abdicating our responsibility and turning it over to people who believe in the inerrancy of automation and software, and are clearly unworried about the misapplication of AI. And are unwilling to examine if the underlying fundamentals are functioning, or who brush our concerns aside and say we don’t get it.
Don’t let them be Assoulines, and treat us like Assoulines.
Another wonderful insightful article, the message is clear. Just one question, I said Assouline just like you did, how is it really pronounced. AI scares me, maybe not now, but 10 yrs from now. How do we help younger generations to control it and not the other way around? Humans are lazy. Our H1B visa issues show that. The countries that were the hungriest are now becoming the fattest as America loses it population to disease, minus zero population growth and literal alien invasions. Failed education systems, self medication depressed students who cannot read, know they will not be employable so why have the student loans , the focus on needs for equality and validation by a mentally ill population that somehow is making laws that are creating and perpetuating continued mentally ill populations.
We are ripe for the taking, starting with dual states of Michifornia.
Trump with a Scarlett F will now be weaponized against any and all parties except the Democrat party. They may have successfully gotten their Uni-party.
Until we dismantle the false narrative around Trump, this is going to be a very short 4 yrs. AI prompts are totally in favor of the Leftist narrative, even Grok.
As I have posted 20 times today, if Zuckerberg seeks absolution, this is the best place for him to START. Start telling the truth about this. The Law Fare, the Witch Hunt, The way information was suppressed to shape a narrative that is believed by enough people to easily win the next election. If it comes from any other source, it will be seen as contrived. Zuckerberg and Assoc. have to dismantle Legacy Media Is he brave enough? Will Soros let him? He owns a monopoly on media operations.
Owning 2/3 of radio stations that he is able to control legacy media through the back door is exactly what the 1996 FCC Act was supposed to prevent. But the FCC fast tracked it instead.
I have been know to be cynical, but my experience has taught me to be. Biden passed legislation to protect illegals today, yesterday he gave more money to the Ukraine. Today a lower court was allowed to forever stigmatize Trump, the one person who what the guts to stand up to the largest organized crime syndicate in the world. The FCC, FAA, DNR, FBI, CIA, DOJ, DOD, .. its a rabbit hole and the power of AI to lead us down that hole scares the shit out of me
Will DOGE, Can DOGE use their power and influence to expose this beyond any doubt? If AI is already, with any prompt, presenting Trump as the person MSNBC and the VIEW say he is, won’t any change been viewed as manipulation, inauthentic? It has to come from Tech Valley and the Zuckerberg/Google/Bezos crew. Will it be enough?
Trump is the most difficult person in history to re brand. I would say he is not rebrandable. There is a new word.
Indeed, what will the DOGE Bros do, what are they willing to do? Are they willing trim their own fat?
I guess we will see