A fundamental characteristic of an AI agent is its intelligence in taking actions, checking the outcomes of those actions, and improving itself when those outcomes diverge from its goals. Self-improvement, where an agent autonomously improves its own functioning, has intrigued the AI community for several decades.
We believe that building robust agent-based systems that are scalable and maintainable requires agents that can autonomously adapt and improve. Self-improvement can happen at multiple levels of sophistication. For simplicity of exploration, let us consider two categories of improvement: narrow self-improvement, in which an agent gets better at its existing tasks within a fixed environment, and broad self-improvement, in which an agent extends its capabilities beyond the environment and goals it started with.
While these categories inevitably form a continuum (e.g., depending on how broadly the environment is defined), they offer one structured way of classifying specific self-improvement scenarios.
Fundamentally, self-improvement can be abstracted as the acquisition or enhancement of a variety of capabilities within the scope of an agent's goals. Whether the self-improvement is narrow or broad, capabilities need to be acquired and enhanced systematically, so as to maximize goal achievement while ensuring that the learnt capabilities do not allow the agent to perform actions that violate the values it is expected to uphold. Value alignment refers to the problem of ensuring that AI systems produce outcomes consistent with human values and preferences. While there has been much well-known recent work on aligning "static", non-self-improving agents, aligning self-improving agents is significantly harder and more critical.
As a thought experiment, consider the formalized notion of intelligence defined in [3]: with $\mu$ denoting an environment, $\pi$ the agent policy, and $V_\mu^\pi$ the value function of $\pi$ in $\mu$, intelligence is a sum over environments of the value function discounted by the Kolmogorov complexity $K(\mu)$ of the environment:

$\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi$
Briefly, the above argues for a universal prior over environments/goals ordered by complexity, with an agent's intelligence measured by a weighted sum of performance over this set. Let us consider how self-improvement and alignment may fit into this framework.
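As a toy illustration of this weighted sum, here is a minimal Python sketch. Since Kolmogorov complexity is uncomputable in general, the environment names, complexities, and per-environment values below are hypothetical stand-ins, not quantities derived from [3]:

```python
# Toy sketch of the complexity-weighted sum Upsilon(pi) = sum_mu 2^{-K(mu)} * V_mu^pi.
# K(mu) is uncomputable in general, so the complexities (in bits) and values here
# are hypothetical stand-ins chosen only to show how the weighting behaves.

environments = [
    # (name, assumed complexity K(mu), value V_mu^pi achieved by policy pi)
    ("simple-gridworld",    5, 0.9),
    ("noisy-bandit",       12, 0.7),
    ("complex-simulation", 30, 0.4),
]

def universal_intelligence(envs):
    """Weighted sum over environments; simpler (low-K) environments carry more weight."""
    return sum(2.0 ** -k * v for _, k, v in envs)

print(f"Upsilon(pi) ~= {universal_intelligence(environments):.6f}")
# The gridworld term (0.9 * 2^-5 ~= 0.028) dwarfs the complex-simulation
# term (0.4 * 2^-30 ~= 4e-10): performance on simple environments dominates.
```

The weighting makes performance on simple, low-complexity environments dominate the measure, which is the role the universal prior plays in the definition.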
Per this thought experiment, one way to incorporate misalignment is the following aligned intelligence:

$\Upsilon_a(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, \hat{V}_\mu^\pi$

We assume a new value function $\hat{V}_\mu^\pi$ which incorporates the reward for being aligned to an alignment (constitution) policy $\pi_c$.
A possible reward function uses $d(\pi, \pi_c)$, a distance measure between the agent policy $\pi$ and the alignment policy distribution $\pi_c$, where larger distances (misalignment) are penalized:

$\hat{V}_\mu^\pi = V_\mu^\pi \cdot r_a\big(d(\pi, \pi_c)\big)$

where $r_a$ is an appropriate reward function defined to capture the distance between $\pi$ and $\pi_c$. For example, a simple distance reward could be $r_a(d) = e^{-d}$, which goes to $1$ for small distances and to $0$ for large distances.
From the expression above, a policy that provides the agent with large rewards from the environment but is not aligned with its constitutional imperatives will yield a lower overall value. Conversely, if the alignment and environment rewards are in line, the modified value function can be large even though the environment reward may have been lower to start with.
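To see this numerically, here is a minimal Python sketch of the modified value function, assuming the exponential distance reward $r_a(d) = e^{-d}$ from above; the two policies, their environment values, and their distances to $\pi_c$ are hypothetical:

```python
import math

def alignment_reward(d):
    """Simple distance reward r_a(d) = e^{-d}: ~1 when the policy is close to pi_c, ~0 when far."""
    return math.exp(-d)

def aligned_value(env_value, d):
    """Modified value V_hat = V * r_a(d(pi, pi_c)): misalignment scales the environment value down."""
    return env_value * alignment_reward(d)

# Hypothetical policies: (environment value V_mu^pi, distance d(pi, pi_c)).
policies = {
    "high-reward but misaligned": (10.0, 3.0),
    "lower-reward but aligned":   (6.0, 0.1),
}

for name, (v, d) in policies.items():
    print(f"{name}: V = {v:.1f}, r_a = {alignment_reward(d):.3f}, V_hat = {aligned_value(v, d):.3f}")

# high-reward but misaligned: V_hat ~= 0.498  (the alignment penalty dominates)
# lower-reward but aligned:   V_hat ~= 5.429  (the aligned policy wins overall)
```

Under this multiplicative form, alignment acts as a gate on environment reward: no amount of raw environment reward compensates for a large divergence from $\pi_c$.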
By continuously learning, adapting, and optimizing their actions, agents can make better decisions, streamline processes, and ultimately deliver substantial productivity gains in enterprise workflows. However, it is important to approach the development and deployment of self-improving agents with caution, ensuring that ethical considerations, transparency, and accountability are prioritized. Since alignment to a constitution can conflict with continuous self-improvement, it is important that self-improvement be carried out systematically, with alignment incorporated into the improvement objective itself.
[1] https://www.lesswrong.com/tag/recursive-self-improvement
[3] Legg, S., & Hutter, M. (2007). Universal Intelligence: A Definition of Machine Intelligence. Minds and Machines, 17, 391-444.
Eager to apply more sophisticated agentic memory to LongMemEval, the largest conversational memory benchmark, we discuss the benchmark, our approach, our state-of-the-art but somewhat disappointing findings, and the need for a benchmark for agentic memory more comprehensive than LongMemEval.
LongMemEval is regarded as the premier benchmark for evaluating long-term memory, going beyond simple tasks with its complex requirements. Despite this, our RAG-like methods achieved state-of-the-art results, suggesting that while LongMemEval is effective, it may not fully capture all aspects of memory and that further benchmark development is needed.
Emergence AI agents are revolutionizing cybersecurity by autonomously correlating vast telemetry data, detecting threats in real time, automating compliance monitoring, and orchestrating efficient SOC operations while reducing manual workloads and enhancing decision-making. By acting as tireless digital teammates, these agents empower organizations to build a scalable, resilient, and proactive security posture fit for today’s complex threat landscape.