We built a Ramp MCP
server that lets you interact with your business data using natural language—by turning our developer API into a SQL interface that an LLM can query.
We asked Claude to “give me a detailed overview of my business’s spend in the past year.” The results were mind-blowing.
Model Context Protocol (MCP) is an open-source standard developed by Anthropic that enables applications to expose data and functionality to LLMs. MCP provides an LLM with access to pretty much anything you want: databases, your computer’s filesystem, Figma, GitHub, etc. Given that Ramp’s developer API is the primary way to access Ramp’s resources externally, we wanted to see what would happen if we put the two together.
An MCP server can be built in Python using FastMCP. MCP "tools" are akin to HTTP endpoints the LLM can access through an MCP client. The following piece of code, for example, is reminiscent of a Flask HTTP endpoint decorated with @app.route(...) and its OpenAPI spec schemas.
from typing import Any, Optional

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("My MCP")

@mcp.tool()  # the function name (or a name passed to the decorator) becomes the tool name
async def my_tool(
    param_1: str,  # params and their type hints are parsed to annotate the tool input schema
    param_2: list[int],
    param_3: Optional[dict[str, Any]] = None,
    # ... more parameters as needed
) -> str:
    """
    Docstring is parsed to annotate the tool description
    """
    results = await some_api(...)  # expose your filesystem, some REST API, DB, etc.
    return "Do something with these results: " + "\n".join(results)

if __name__ == "__main__":
    mcp.run(transport="stdio")  # available with stdio or SSE transport
Using FastMCP
and a lightweight Ramp developer API client, we were able to build a working early prototype of the Ramp MCP server. With Claude Desktop as our MCP client, Claude was able to both generate visualizations and run simple analyses on spend data pulled from Ramp’s APIs using natural language. Adding a new source of data — like a different API endpoint — was as simple as defining a new tool. As a proof of concept, we had it issue cards on demand for a company offsite.
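To give a sense of how little code a new data source takes, here's a minimal sketch of a tool wrapping a single endpoint. The env var name, endpoint path, and use of httpx are our assumptions for illustration, not the actual ramp-mcp client code:

import os

import httpx
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Ramp MCP sketch")
RAMP_API_BASE = "https://api.ramp.com/developer/v1"  # assumed base URL for the example

@mcp.tool()
async def list_departments() -> str:
    """Fetch the business's departments from the Ramp developer API."""
    token = os.environ["RAMP_API_TOKEN"]  # hypothetical env var holding an OAuth access token
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"{RAMP_API_BASE}/departments",
            headers={"Authorization": f"Bearer {token}"},
        )
        resp.raise_for_status()
        return resp.text  # hand the raw JSON back to the model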
Our initial setup worked fine on a small demo business, but we quickly ran into a few scaling issues: miscalculations, limited context windows, input size limits, and high input token usage. At first we hacked together a simple pagination tool to chunk responses into smaller, digestible parts, but we struggled to scale meaningfully beyond a few hundred transactions. Then we asked Claude what it would prefer the data to look like, and it said it preferred predictable data formats that enabled server-side functions. That sounded like SQL: structured, predictable, and built for querying. Most importantly, with SQL, Claude could load as little raw data into its context window as possible, making the server do most of the heavy computational lifting. Thanks to the reduced token usage, the Ramp MCP server even worked with the free version of Claude.
Everything aside from the Developer API runs locally
In order to implement a SQL interface to analyze our API data, we built a lightweight ETL pipeline that pulls data from our APIs, transforms it, and loads it into an in-memory SQLite database.
To transform our RESTful API’s JSON responses into SQL rows, we flattened the JSON and inferred the appropriate SQL column type based on the value types. To keep things simple, we set missing keys as NULL
and cast lists to text.
{"id": 10, "amount": 123, "user": {"id": 1, "cardholder_first_name": "Eric"}}
becomes
id | user_cardholder_first_name | user_id | amount
10 | "Eric"                     | 1       | 123
Reporting use cases involve a ton of data and complex queries, often leading to timeouts. To solve this, we added an interface to an OLAP-powered API built by our data platform team. This allowed us to extract spend data specifically optimized for reporting use cases, eliminating the timeouts.
To expose all of this functionality to the LLM, we defined a few different tools:
- load_transactions to pull data from the Ramp API
- process_data to transform the data from the API and load it into the SQLite table (we could've skipped this step to reduce roundtrips)
- execute_query to run queries on the in-memory database directly (sketched below)

After this move to a SQL-based paradigm, Claude went from struggling with a few hundred data points to accurately analyzing tens of thousands of spend events. It can now load as little or as much data as it needs while running aggregate or window functions to gain a better understanding of the data. With these changes, the latency of extracting data from the API became the largest bottleneck as tool calls started timing out.
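To give a feel for the shape of that last tool, here's a simplified sketch of what an execute_query tool might look like; the real server's signature and result formatting may differ:

import sqlite3

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Ramp MCP sketch")
db = sqlite3.connect(":memory:", check_same_thread=False)  # shared in-memory database

@mcp.tool()
async def execute_query(query: str) -> str:
    """Run a SQL query against the in-memory database and return pipe-delimited rows."""
    cursor = db.execute(query)
    if cursor.description is None:  # statement returned no rows (e.g. DDL)
        return "OK"
    columns = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    return "\n".join([" | ".join(columns)] + [" | ".join(map(str, r)) for r in rows])

With this in place, a question like "top ten merchants by spend" becomes a single GROUP BY query, and only its handful of result rows enter the context window instead of thousands of raw transactions.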
Claude querying Ramp spend data from an in-memory SQLite database
LLMs are significantly better at SQL than at math, which is why this set of tools proved an effective way to expose our RESTful API's relational spend data. MCP lets you create whatever interface works best for the type of data you want to connect to it, and it's on you to tinker a bit to figure out the best data access strategy.
Our open-source ramp-mcp server can access a business's context using a variety of tools, enabling natural language access to your Ramp data. It can create curated views of your spend, identify potential cost savings, and help you navigate your business hierarchy, all while surfacing insights we hadn't even considered.
There are still limitations. There's the occasional reliability issue, and API latency can be high for very large businesses with a lot of data. There are plenty of optimizations we could implement (concurrent requests, pulling data from the APIs asynchronously, smart caching, using DuckDB, etc.), but building a scalable and truly reliable agentic AI requires a great deal of complex technical work beyond what MCP currently offers. Write tools can be particularly unreliable, so we'll work on a safety framework before releasing tools that let an agentic LLM perform actions in your business on your behalf.
MCP and similar technologies can introduce information security risk if left unchecked. LLMs can pull and understand large amounts of data from any source at their disposal. This means authentication credentials like API keys need to be secured, and following the principle of least privilege is as crucial as ever. As mitigations, we implemented audit logging, and you can pick a constrained set of OAuth scopes and tools to make available to the MCP client.
Keep in mind that LLMs themselves may not always pick the correct tools, or they may use them incorrectly. Prompt engineering can alleviate some of these issues, but in our testing Claude occasionally made mistakes even when given the same prompt twice across separate conversations.
MCP is early and has many limitations to overcome, but it also has a strong and growing community of engineers who can see the tip of the iceberg of how transformative this technology can be. Many MCP servers have been made available by businesses and enthusiasts alike, and anyone can connect their MCP client to a wide range of tools in their tech stack.
Whether you build a client or a server, I encourage you to give it a try. Miss out at your own peril!