Here’s to those who build

Saturday, November 15, 2025
Alfândega Congress Center, Porto, Portugal

Highlights from Sword AI Summit 2024

About the event

Sword AI Summit Manifesto

At Sword, we deeply believe in democratizing access to fundamental matters, especially those upon which important things are built. Just like access to high-quality care, which helps you get your life back, access to knowledge and networking nurtures growth and innovation—both individually and collectively.

That’s why we set up the Sword AI Summit: the only AI event where, for a symbolic fee, you can access world-class experts building state-of-the-art AI.

Event

What you can look forward to

Keynotes for the builders

Talks on the latest developments in GenAI, including new models, agentic frameworks, and model serving platforms. Practical insights you can actually leverage in your work.

Connect with world-class practitioners

Meet the people shaping the AI landscape by building state-of-the-art AI models, tools, and real-world applications, and gain valuable insights straight from the source.

All hours displayed are in GMT
08:30–09:15 am
Doors open and check-in
09:15–09:30 am
Welcome
09:30–09:45 am
Sword's AI Vision
Luís Ungaro (Sword Health)
09:45–10:10 am
Thinking about thinking: Algorithmic skills learning for LLMs
Michal Valko (Stealth): Metacognitive knowledge refers to humans’ intuitive understanding of their own thinking and reasoning processes. Today’s best LLMs clearly demonstrate reasoning skills, and evidence suggests they also hold metacognitive knowledge, including the ability to identify skills and procedures to apply in specific tasks. This talk explores how prompt-guided interactions allow LLMs to assign skill labels to math questions and cluster them into interpretable families. These methods improve accuracy on benchmarks such as GSM8k and MATH, including for code-assisted models.
10:10–10:35 am
Meta Agents Research Environments (ARE), scaling up agent environments and evaluations
Pierre Ménard (Meta): This talk introduces Meta Agents Research Environments (ARE), a platform for building and evaluating intelligent agents across synthetic and real-world settings. ARE provides simple abstractions for creating diverse environments with their own tools, rules, and verifiers, enabling scalable experimentation and faster iteration. The presentation will also cover Gaia2, a benchmark developed within ARE to measure general agent capabilities, including reasoning under uncertainty, adaptation, and collaboration. Together, ARE and Gaia2 offer a foundation for more robust evaluation and meaningful progress in agent research.
10:35–11:05 am
Coffee break
11:05–11:30 am
The End of the Language Barrier? Machine Translation of Human Languages in the Age of LLMs
Markus Freitag (Google): Is Machine Translation a solved problem? In the age of LLMs, many in the field would say yes. This talk pushes back on that narrative. While LLMs have achieved remarkable performance, we argue that the task is far from complete. We will dissect the myth of "solved" translation, uncover the nuanced challenges that remain, and highlight the exciting new frontiers for machine translation research that have emerged in this new paradigm.
11:30–11:55 am
AI for code assistant: From completion to code agents
Baptiste Rozière (Mistral): This presentation explores how large language models can be trained to power code assistants. It presents key applications such as in-IDE code completion and agentic capabilities. Attendees will get an overview of the use of LLMs for code assistants, and some insight into pre-training and post-training methodologies.
11:55 am–02:00 pm
Lunch break
02:00–02:15 pm
Horizon Award
02:15–02:40 pm
Fantastic abstractions and where to find them
João Gante (Hugging Face): Reproducing AI research used to be a challenge, often taking days to replicate simple results. Today, open-access models with hundreds of billions of parameters can be integrated into products overnight. This talk examines how research has scaled in both quantity and complexity while becoming simpler to use. It emphasizes the central role abstractions play in enabling reliable research and production, from high-level frameworks down to the architectural layers of LLMs.
02:40–03:05 pm
Physical AI: The next frontier of AI
Teresa Conceição (NVIDIA): We are entering a new phase of AI, Physical AI, where robots and autonomous systems can perceive, understand, and act in the physical world. This session reveals what it takes to build and scale embodied intelligence, and how breakthroughs in generative AI, simulation platforms, and world foundation models are unlocking the next era of robotics.
03:05–03:25 pm
The AI Engine behind Mind: From one-fits-all to personalised AI for mental health
Nuno Guerreiro (Sword Health): Effective mental health support requires adapting to each individual’s patterns, maintaining consistency over time, and building genuine trust. Most AI approaches follow a one-fits-all model and fail to deliver. This talk presents how Sword built Mind, a personalised AI therapist trained with character, biometric data and ecosystem signals to provide engaging, proactive and therapeutic support. It dives into how AI can move beyond generic interactions toward truly personalised care in mental health.
03:25–03:45 pm
Phoenix: Building an agent for clinical-grade care
Diogo Gonçalves (Sword Health): Language is becoming the most natural interface for human–AI interaction, but designing agents that deliver safe, structured and clinically relevant guidance is a unique challenge. Phoenix, Sword’s AI Care Specialist, combines speech recognition, LLMs, and text-to-speech to guide patients through therapy sessions. This talk presents how Phoenix was designed to ensure clinical safety, contextual awareness and trust, validated with clinicians and deployed in millions of sessions, and what this means for scaling AI-driven healthcare globally.
03:45–04:25 pm
Coffee break
04:25–04:45 pm
Open stage
04:45–05:10 pm
What we measure is what we build: Rethinking evaluation of LLMs
Marzieh Fadaee (Cohere Labs): Evaluating large language models is far from straightforward. Traditional benchmarks expose limitations in measuring reasoning, robustness and generalisation. This talk explores the challenges of evaluation, from trade-offs between human and automated metrics to the risks of leaderboard overfitting. It highlights new directions in multilingual and task-specific evaluation, and argues for pluralistic, context-aware strategies that align with real-world use cases instead of chasing a single performance score.
05:10–05:35 pm
Closing the loop: From scaffolding and evals to reinforcement fine-tuning
Theophile Sautory (OpenAI): Models alone are not enough to create reliable AI systems. They require scaffolding, evaluation frameworks, and feedback loops to evolve. This talk explores how building the right tools and grading frameworks creates the signals for reinforcement fine-tuning, and how this progression lays the groundwork for more reliable and adaptive AI systems.
05:35–06:00 pm
From vibe to value: How AI PMs and AI engineers collaborate on evals
Aman Khan (Arize AI): AI products rely on rigorous evaluations, but success depends on collaboration between product managers and engineers. This session presents a practical playbook for building evals together, a process that separates successful AI deployments from failed experiments. Drawing from hands-on experience, it highlights how shared ownership of eval design creates trustworthy products. The talk picks up from the post Aman wrote with Lenny, "Beyond vibe checks: A PM’s complete guide to evals".
06:00–06:05 pm
Donation moment
06:05–06:20 pm
Closing keynote

FAQ

Know before you go

Stay tuned by following our LinkedIn

  • Who is the event for?

  • How much does it cost to attend?

  • When and where is the event?

  • How is the lunch break going to work?

  • What is the dress code?

  • How can I contact the organization?

  • What is the best way to get to the venue?