# InspectAI

[InspectAI](https://inspect.aisi.org.uk/) is an evaluation framework created by the [UK AI Security Institute](https://aisi.gov.uk/). It can be used to run a wide range of evaluations that measure coding, reasoning, agentic tasks, knowledge, behavior, and multimodal understanding. With InspectAI, evaluations and benchmarking become simple, reproducible, and consistent across multiple models and providers.

## Prerequisites

Before you begin, ensure you have:

1. A [SambaCloud](https://cloud.sambanova.ai/) account and an active API key, which you can create on the [SambaCloud API Keys](https://cloud.sambanova.ai/apis) page.
2. Your SambaNova API key set as an environment variable, so that InspectAI can authenticate with SambaNova:

```bash
export SAMBANOVA_API_KEY=your-sambacloud-api-key
```

3. A Python environment with the required packages installed:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install inspect-ai
pip install openai
```
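
The prerequisites above can be verified before running an evaluation. The following is a minimal sketch that uses only the Python standard library; the helper name `check_prerequisites` is hypothetical and not part of InspectAI. It assumes the packages import as `inspect_ai` and `openai`.

```python
import importlib.util
import os


def check_prerequisites(env=None, packages=("inspect_ai", "openai")):
    """Return a list of problems; an empty list means everything was found."""
    env = os.environ if env is None else env
    problems = []
    # The API key must be present and non-empty in the environment.
    if not env.get("SAMBANOVA_API_KEY"):
        problems.append("SAMBANOVA_API_KEY is not set")
    # find_spec returns None when a top-level package cannot be located.
    for name in packages:
        if importlib.util.find_spec(name) is None:
            problems.append(f"package '{name}' is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing:", problem)
```

Run the script inside the activated virtual environment. It prints nothing when the key and both packages are found.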

## Running evaluations

Before you can run your first evaluation, you’ll need to define a task in a Python script.

Each task has three main components:

1. Dataset – the list of inputs and expected results
2. Solver – how the model produces its outputs
3. Scorer – how outputs are evaluated against the expected results

### Example: Hello world

Save the following code into a `hello_world.py` file.

```python
from inspect_ai import Task, task
from inspect_ai.dataset import Sample
from inspect_ai.scorer import exact
from inspect_ai.solver import generate

@task
def hello_world():
    return Task(
        dataset=[
            Sample(
                input="Just reply with Hello World",
                target="Hello World",
            )
        ],
        solver=[generate()],
        scorer=exact(),
    )
```

Then run the evaluation with SambaCloud. Here’s an example using the `Llama-4-Maverick-17B-128E-Instruct` model:

```bash
inspect eval hello_world.py --model sambanova/llama-4-maverick-17b-128e-instruct
```
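
To run the same task against several models, the command above can be scripted. The sketch below wraps the `inspect eval` CLI with `subprocess`; the helper names `inspect_eval_command` and `run_evals` are hypothetical. The optional `--limit` and `--log-dir` flags are included on the assumption that your installed version of Inspect supports them; confirm with `inspect eval --help`.

```python
import subprocess


def inspect_eval_command(task_file, model, limit=None, log_dir=None):
    """Build the argument list for one `inspect eval` run."""
    command = ["inspect", "eval", task_file, "--model", model]
    if limit is not None:
        # Evaluate only the first N samples of the dataset.
        command += ["--limit", str(limit)]
    if log_dir is not None:
        # Write logs to a directory other than the default.
        command += ["--log-dir", log_dir]
    return command


def run_evals(task_file, models, **options):
    """Run the task once per model and return a {model: exit code} mapping."""
    return {
        model: subprocess.run(inspect_eval_command(task_file, model, **options)).returncode
        for model in models
    }
```

For example, `run_evals("hello_world.py", ["sambanova/llama-4-maverick-17b-128e-instruct"])` runs the task above once and returns the exit code of the CLI. Add further `sambanova/...` model names to the list to compare models on the same task.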

## Viewing results

* Results are stored in the `./logs` directory.
* Use the Inspect web UI for interactive viewing:

```bash
inspect view
```

* You can also use the [Inspect Visual Studio Code extension](https://inspect.aisi.org.uk/log-viewer.html#vs-code-extension) for easier log exploration.
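
Logs can also be read programmatically. The following is a rough sketch that assumes the `inspect_ai.log` module exposes `list_eval_logs` and `read_eval_log`, and that a log object carries `eval.task`, `eval.model`, `status`, and `results.scores` as used here. Check these names against the Inspect reference for your installed version. The helpers `summarize_log` and `summarize_log_dir` are hypothetical.

```python
def summarize_log(log):
    """Flatten one eval log into a dict of task, model, status, and metric values."""
    summary = {"task": log.eval.task, "model": log.eval.model, "status": log.status}
    # Results may be absent when a run failed or was cancelled.
    if log.results is not None:
        for score in log.results.scores:
            for metric_name, metric in score.metrics.items():
                summary[f"{score.name}/{metric_name}"] = metric.value
    return summary


def summarize_log_dir(log_dir="./logs"):
    """Summarize every log found in log_dir (requires inspect-ai)."""
    # Imported here so summarize_log stays usable without inspect-ai installed.
    from inspect_ai.log import list_eval_logs, read_eval_log

    return [
        summarize_log(read_eval_log(info, header_only=True))
        for info in list_eval_logs(log_dir)
    ]
```

Calling `summarize_log_dir()` after an evaluation returns one dictionary per log file, which is convenient for comparing models in a script or notebook.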

## More information

* See the [InspectAI repo](https://github.com/UKGovernmentBEIS/inspect_ai/tree/main/build) for more evaluation examples.
* For more details, see the official [InspectAI documentation](https://inspect.aisi.org.uk/reference/).
