
According to Gartner in August 2023, "Generative AI has almost immediately reached the Gartner Hype Cycle's Peak of Inflated Expectations," with OpenAI's ChatGPT reaching 100 million users within 60 days. According to Forbes, as of September of that same year, challenges with LLMs were becoming apparent as researchers and developers applied the technology to concrete domains and industry use cases.

[Figure: Gartner Hype Cycle for Artificial Intelligence, 2023]

This signaled a descent into Gartner's Trough of Disillusionment: interest wanes as experiments and implementations fail to deliver, and producers of the technology shake out. This is exciting, because rapidly traversing the cycle separates snake oil from true innovation.
 


Generative AI and LLMs create a form of "meaning space": effectively an object model that leverages the semantics of language, and its finite vocabulary, to access relationships among concepts. Seen this way, what has been invented is akin to the transistor. The discontinuous productivity jump that followed the transistor came from the integrated circuits built on top of it; in essence, integration and application are what must now follow this new capability. Couple this with the fact that we are dealing with semantics and language, and the applicability to the problem domain of Software Engineering becomes clear, and not just at the line-of-code or idiom levels of abstraction currently popularized by the coding co-pilots. The incumbent paradigm and technologies, GitHub among them, are applying this capability in a narrow and short-sighted way. And as George Box observed, "all models are wrong, but some are useful"; LLMs are no exception.

LLMs know about use cases. They know about user stories. They know about the Unified Modeling Language (UML), and they know a great deal about code syntax and semantics from ingesting the corpus of StackOverflow. We also know about hallucinations and inconsistency in the responses returned by LLM prompting. This is to be expected until enough reinforcement learning from human feedback (RLHF) is applied to the raw internet corpus.

GIGO: garbage in, garbage out, as the old adage goes. We know that LLMs are quite good at one-shot prompting, with results on AP-level tests reaching the 95th percentile and above. What LLMs are not so good at is context and inference. If you ask ChatGPT "what are the pros and cons of" your favorite software development practice or technology option, you will get low-value responses. Similarly, if asked to reason among choices, you may not get a useful response at all. What is needed is another form of codified knowledge working in concert with the LLM: Expert System knowledge. Such a system leverages rules-based AI and a universe of discourse to aid the prompting. When technologies are properly contextualized with codified expert knowledge, Tree-of-Thought (ToT) prompting or set-based, one-shot prompting yields far better results.
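To make the idea concrete, here is a minimal sketch of contextualizing a prompt with rules-based expert knowledge before it ever reaches an LLM. The rule base, topic matching, and prompt template are illustrative assumptions, not the actual Advisor AI Platform implementation:

```python
# Hypothetical sketch: inject codified expert rules into an LLM prompt.
# The rules and template below are illustrative, not a real product's.

RULES = {
    "microservices": [
        "Prefer microservices when teams need to deploy independently.",
        "Avoid microservices for small teams; operational overhead dominates.",
    ],
    "monolith": [
        "A modular monolith suits early-stage products with one team.",
    ],
}

def contextualize(question: str) -> str:
    """Attach every expert rule whose topic appears in the question."""
    facts = [rule
             for topic, rules in RULES.items()
             if topic in question.lower()
             for rule in rules]
    context = "\n".join(f"- {f}" for f in facts)
    return (
        "Using ONLY the expert context below, answer the question.\n"
        f"Expert context:\n{context}\n"
        f"Question: {question}\n"
    )

prompt = contextualize("Should we adopt microservices or a monolith?")
print(prompt)
```

The point is the division of labor: the expert system supplies the universe of discourse, and the LLM reasons only within it, which is what turns an open-ended "pros and cons" question into a grounded one.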

This is what the Advisor AI Platform achieves, acting as a prompt driver for a contextualized, risk-mitigated decision tree. Simple ad hoc prompting can currently handle simple software creation. To truly realize the vision of autonomous agency for architecting software-intensive systems, however, an approach leveraging canned "views" of generative prompts is required. This is a much deeper solution strategy for what is a very ambitious problem space.
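A prompt driver over a decision tree of canned prompts can be sketched as follows. The node names, prompt texts, and the stubbed answer function standing in for an LLM call are all hypothetical, intended only to show the driving mechanism:

```python
# Hypothetical sketch of a decision-tree prompt driver: each node carries a
# canned prompt, and the answer routes to the next node. The tree content
# and the stubbed LLM are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Node:
    prompt: str
    children: dict = field(default_factory=dict)  # answer -> next Node

def drive(node: Node, answer_fn) -> list:
    """Walk the tree from the root, collecting the prompts actually issued."""
    issued = []
    while node is not None:
        issued.append(node.prompt)
        answer = answer_fn(node.prompt)
        node = node.children.get(answer)  # leaf or unknown answer ends the walk
    return issued

tree = Node(
    "Is the system latency-sensitive? (yes/no)",
    {"yes": Node("Propose a caching architecture."),
     "no": Node("Propose a batch-oriented architecture.")},
)

# Stub standing in for an LLM call: always answers "yes".
trace = drive(tree, lambda p: "yes")
print(trace)
```

Because each branch is a pre-authored, contextualized prompt, the tree constrains the LLM to a vetted path instead of leaving the whole architecture decision to one ad hoc prompt.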

 


© 2024 SDE - SoftwareFactory.ai
