reverse engineering granola for obsidian export
granola <> obsidian
for those unaware, granola is a tool that generates summary notes from your meetings
one unfortunate thing though is that i am already quite loyal to obsidian as the designated place where i store notes, so i wanted to see if there was a way to port my granola notes there.
intercepting granola's traffic
first thing i learned is that granola notes are not stored locally, but instead in their own cloud. this meant that getting my notes out of granola was a bit less straightforward, and that i needed to do more poking and prodding.
i decided to use mitmproxy to set up a local proxy to intercept & inspect requests.
brew install --cask mitmproxy
after installing, i ran mitmproxy once to generate a root certificate. mitmproxy performs a man-in-the-middle interception of HTTPS traffic by generating fake TLS certificates for each site you visit.
by default, your system and browser will not trust mitmproxy's certificate, so we'll have to manually trust it by running this command:
sudo security add-trusted-cert -d -r trustRoot \
-k /Library/Keychains/System.keychain \
~/.mitmproxy/mitmproxy-ca-cert.pem
next, we need to configure granola to route through our proxy. set your network proxy settings to point to 127.0.0.1:8080 (mitmproxy's default), then launch granola.
on mac, you can do this in your system settings by going to Wi-Fi > Details… > Proxies and pointing both the web proxy (HTTP) and secure web proxy (HTTPS) at 127.0.0.1 on port 8080.
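if you'd rather not click through system settings, macOS ships a networksetup CLI that does the same thing. this assumes your network service is named "Wi-Fi" (run networksetup -listallnetworkservices to check):

```shell
# route HTTP and HTTPS through mitmproxy
networksetup -setwebproxy "Wi-Fi" 127.0.0.1 8080
networksetup -setsecurewebproxy "Wi-Fi" 127.0.0.1 8080

# turn both off again when you're done poking around
networksetup -setwebproxystate "Wi-Fi" off
networksetup -setsecurewebproxystate "Wi-Fi" off
```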
reverse-engineering the api
in mitmproxy, there's a lot going on. you can make your life easier by using the filter tool: press f
and filter for api.granola.ai (the expression ~d api.granola.ai matches on domain)
investigation
- /v1/get-document-set returns a json with the ids of all the documents you own
- /v1/get-document-metadata fetches the title of our document
- it seems that documents are not stored in one blob, and instead are an aggregation of "panels", which i believe are the various h1 chunks you see in the actual note
- the request body of /v1/update-document-panel and /v1/create-document-panel contain the actual content of the document
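you can replay these endpoints by hand with curl to confirm what mitmproxy showed. the payload shape here mirrors what the exporter script below sends; i'm inferring it from the intercepted requests, so treat it as a sketch:

```shell
# list your document ids — requires the bearer token from the next section
curl -s -X POST https://api.granola.ai/v1/get-document-set \
  -H "Authorization: Bearer $GRANOLA_BEARER_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"limit": 100, "offset": 0}'
```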
authorization
every request to api.granola.ai carries the same authorization header. i grabbed the bearer token and stored it in a .env file to be used later by our script.
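the script below looks for this file as granola_config.env in the same directory as the script, and parses simple KEY=VALUE lines (comments starting with # are ignored). mine is one line:

```shell
# granola_config.env — lives next to the export script
GRANOLA_BEARER_TOKEN=<paste the bearer token from mitmproxy here>
```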
building the exporter
after some investigation, i asked my best friend and close confidant mr. opus to write a python script to automate the export process. the script works at the time of this writing. one caveat: granola has a token refresh mechanism, and my token went stale ~2 weeks later, so this is not truly hands-off automation. still, feel free to use it as a starting point if the api schema changes down the line.
ps, as a chronically online information hoarder, i am always looking for better systems to consume, organize, and retrieve the many gigabytes of information i come across every day. if you have opinionated obsidian/note taking workflows, i'd like to chat :)
before running the script, create a python virtual environment & pip install requests
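concretely, the setup looks like this (i've assumed you saved the script as granola_export.py — name it whatever you like):

```shell
python3 -m venv .venv
source .venv/bin/activate
pip install requests
python granola_export.py
```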
voila! happy information-hoarding! big shoutout to Joseph Thacker for the inspiration
#!/usr/bin/env python3
"""
Export Granola meeting notes → Markdown files.

Requirements
------------
python -m pip install requests

Environment variable GRANOLA_BEARER_TOKEN must be set.
"""

import os
import sys
import logging
from pathlib import Path

import requests

# -----------------------------------------------------------------------------
# CONFIG
# -----------------------------------------------------------------------------
OUTPUT_DIR = Path.home() / "Desktop/obsidian/granola"  # change if you want

LIST_URL = "https://api.granola.ai/v1/get-document-set"
META_URL = "https://api.granola.ai/v1/get-document-metadata"   # returns panel_ids
PANEL_URL = "https://api.granola.ai/v1/get-document-panel"     # returns panel content
PANELS_URL = "https://api.granola.ai/v1/get-document-panels"   # try plural version
ALT_PANEL_URL = "https://api.granola.ai/v1/get-document"       # alternative endpoint

LIST_PAGE_SIZE = 100  # Granola accepts 100 max

# -----------------------------------------------------------------------------
# HELPERS
# -----------------------------------------------------------------------------
log = logging.getLogger("granola_export")
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
    datefmt="%H:%M:%S",
)


def load_env_file():
    """Load environment variables from granola_config.env if it exists."""
    env_file = Path(__file__).parent / "granola_config.env"
    if env_file.exists():
        log.info("Loading config from %s", env_file)
        with open(env_file, "r") as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, value = line.split("=", 1)
                    os.environ[key] = value
                    log.info("Loaded environment variable: %s", key)
    else:
        log.warning("Config file not found: %s", env_file)


def bearer_token() -> str:
    # Load config file first
    load_env_file()
    tok = os.getenv("GRANOLA_BEARER_TOKEN")
    if not tok:
        log.error("GRANOLA_BEARER_TOKEN environment variable is not set.")
        sys.exit(1)
    log.info("Using bearer token: %s...%s", tok[:20], tok[-10:])
    return tok


HEADERS = {
    "Content-Type": "application/json",
    "Accept": "*/*",
    "User-Agent": "Granola/6.72.0",
    "X-Client-Version": "6.72.0",
    "Authorization": f"Bearer {bearer_token()}",
}


def prosemirror_to_md(node):
    """Very small converter – enough for headings, paragraphs, bullet lists."""
    if not isinstance(node, dict):
        return ""
    t = node.get("type")
    if t == "text":
        return node.get("text", "")
    if t == "paragraph":
        return "".join(prosemirror_to_md(c) for c in node.get("content", [])) + "\n\n"
    if t == "heading":
        level = node.get("attrs", {}).get("level", 1)
        inner = "".join(prosemirror_to_md(c) for c in node.get("content", []))
        return f"{'#' * level} {inner}\n\n"
    if t == "bulletList":
        lines = []
        for li in node.get("content", []):
            if li.get("type") == "listItem":
                txt = "".join(prosemirror_to_md(c) for c in li.get("content", []))
                lines.append(f"- {txt.strip()}")
        return "\n".join(lines) + "\n\n"
    # fall-through: recurse
    return "".join(prosemirror_to_md(c) for c in node.get("content", []))


def sanitize_filename(s: str) -> str:
    bad = '<>:"/\\|?*'
    return "".join(c for c in s if c not in bad).replace(" ", "_") or "Untitled"


# -----------------------------------------------------------------------------
# MAIN FLOW
# -----------------------------------------------------------------------------
def fetch_doc_ids():
    """Paginate through /v1/get-document-set until no IDs left."""
    ids = []
    offset = 0
    while True:
        payload = {"limit": LIST_PAGE_SIZE, "offset": offset}
        log.info("Making API request to %s with payload: %s", LIST_URL, payload)
        r = requests.post(LIST_URL, headers=HEADERS, json=payload, timeout=15)
        log.info("Response status: %s", r.status_code)
        r.raise_for_status()
        page = r.json()
        # The API returns documents as an object with IDs as keys
        documents_obj = page.get("documents", {})
        batch = list(documents_obj.keys()) if documents_obj else []
        log.info("Found batch of %d documents", len(batch))
        ids.extend(batch)
        if len(batch) < LIST_PAGE_SIZE:
            break
        offset += LIST_PAGE_SIZE
    return ids


def fetch_metadata(doc_id):
    """Returns metadata dict that includes panel_ids."""
    r = requests.post(
        META_URL,
        headers=HEADERS,
        json={"document_id": doc_id},
        timeout=15,
    )
    r.raise_for_status()
    metadata = r.json()
    log.info("Metadata for doc %s: %s", doc_id[:8], metadata)
    return metadata


def fetch_panel(panel_id):
    """Fetch panel content from panel ID."""
    r = requests.post(
        PANEL_URL,
        headers=HEADERS,
        json={"panel_id": panel_id},
        timeout=15,
    )
    r.raise_for_status()
    return r.json()["panel"]["content"]  # ProseMirror JSON


def ensure_output_dir():
    OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
    log.info("Saving Markdown to %s", OUTPUT_DIR)


def save_doc(meta, doc_id):
    title = meta.get("title") or "Untitled Granola Note"
    log.info("Processing: %s", title)

    # First, see if there's content directly in the metadata
    if "content" in meta:
        panel_content = meta["content"]
        md_body = prosemirror_to_md(panel_content)
    else:
        # Try multiple approaches to get panel content
        panel_content = None

        # Approach 1: get-document-panels (plural) with document_id - THIS WORKS!
        try:
            r = requests.post(
                PANELS_URL,
                headers=HEADERS,
                json={"document_id": doc_id},
                timeout=15,
            )
            if r.status_code == 200:
                panels_data = r.json()
                # Try to extract content from the panels response
                if isinstance(panels_data, list) and len(panels_data) > 0:
                    panel_content = panels_data[0].get("content")
                elif isinstance(panels_data, dict) and "panels" in panels_data:
                    panels = panels_data["panels"]
                    if panels and len(panels) > 0:
                        panel_content = panels[0].get("content")
        except Exception as e:
            log.debug("get-document-panels failed: %s", e)

        # Approach 2: alternative get-document endpoint (fallback)
        if not panel_content:
            try:
                r = requests.post(
                    ALT_PANEL_URL,
                    headers=HEADERS,
                    json={"document_id": doc_id},
                    timeout=15,
                )
                if r.status_code == 200:
                    doc_data = r.json()
                    panel_content = doc_data.get("content")
            except Exception as e:
                log.debug("get-document failed: %s", e)

        if panel_content:
            md_body = prosemirror_to_md(panel_content)
        else:
            log.warning("Could not retrieve content for %s", title)
            # Create a basic markdown file with just the title and metadata
            md_body = f"# {title}\n\n*Content could not be retrieved from Granola API*\n\n"

    # Escape quotes outside the f-string: backslashes inside f-string braces
    # are a SyntaxError before Python 3.12
    safe_title = title.replace('"', '\\"')
    fm = [
        "---",
        f"granola_id: {doc_id}",
        f'title: "{safe_title}"',
    ]
    for field in ("created_at", "updated_at"):
        if meta.get(field):
            fm.append(f"{field}: {meta[field]}")
    fm.append("---\n")

    final = "\n".join(fm) + md_body
    fn = OUTPUT_DIR / f"{sanitize_filename(title)}.md"
    fn.write_text(final, encoding="utf-8")
    log.info("wrote %s", fn.name)


def main():
    ensure_output_dir()
    log.info("Fetching document IDs …")
    ids = fetch_doc_ids()
    log.info("Found %d docs", len(ids))
    for i, doc_id in enumerate(ids, 1):
        try:
            meta = fetch_metadata(doc_id)
            save_doc(meta, doc_id)
        except Exception as e:
            log.exception("doc %s failed: %s", doc_id, e)


if __name__ == "__main__":
    main()
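if you want to sanity-check the prosemirror_to_md converter without hitting the api, you can feed it a hand-written prosemirror document. the sample payload here is fabricated by me for illustration, not a real granola response:

```python
# standalone copy of the converter from the script above, run on a
# made-up ProseMirror doc to check the Markdown it emits
def prosemirror_to_md(node):
    if not isinstance(node, dict):
        return ""
    t = node.get("type")
    if t == "text":
        return node.get("text", "")
    if t == "paragraph":
        return "".join(prosemirror_to_md(c) for c in node.get("content", [])) + "\n\n"
    if t == "heading":
        level = node.get("attrs", {}).get("level", 1)
        inner = "".join(prosemirror_to_md(c) for c in node.get("content", []))
        return f"{'#' * level} {inner}\n\n"
    if t == "bulletList":
        lines = []
        for li in node.get("content", []):
            if li.get("type") == "listItem":
                txt = "".join(prosemirror_to_md(c) for c in li.get("content", []))
                lines.append(f"- {txt.strip()}")
        return "\n".join(lines) + "\n\n"
    # unknown node types: recurse into children
    return "".join(prosemirror_to_md(c) for c in node.get("content", []))


sample = {
    "type": "doc",
    "content": [
        {"type": "heading", "attrs": {"level": 1},
         "content": [{"type": "text", "text": "Standup"}]},
        {"type": "paragraph",
         "content": [{"type": "text", "text": "Quick sync."}]},
        {"type": "bulletList", "content": [
            {"type": "listItem", "content": [
                {"type": "paragraph",
                 "content": [{"type": "text", "text": "ship exporter"}]}]},
        ]},
    ],
}

print(prosemirror_to_md(sample))
# → "# Standup" heading, a paragraph, and one bullet
```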