Voice-Control Your Notes with AI

Difficulty: ★☆☆☆☆ Getting Started · Estimated time: ~30 minutes

You’re in a meeting and an idea strikes. Instead of opening an app, switching windows, finding the right note, and typing it all out… you just say:

“Add to my daily note: Sarah mentioned a job opening at Xero”

And it appears in Obsidian. No typing, no clicking, no context switching. That’s what we’re building: a voice-first daily notes workflow where you speak naturally and AI handles the rest, capturing thoughts, tracking tasks, and searching your vault.

Tutorial led by Chan Meng — Senior AI/ML Engineer, open-source contributor, and former ByteDance developer. Chan has built 30+ live applications and specialises in AI-powered solutions. She is also a panel speaker at this event and the developer behind this website.

What you will build

Speak

Capture thoughts by voice — say what you want to remember and it lands in your daily note

Command

Manage tasks with natural language — add them, check them off, review what is left

Find

Search your notes by asking — no need to remember file names or folder structures

How it works

You speak naturally (or type, if you prefer). Wispr Flow converts your voice to text. Gemini CLI understands what you want and runs the right Obsidian commands behind the scenes. Your notes update instantly — you never need to learn or type a single command.

Wispr Flow is optional: it only makes the experience hands-free. Every prompt in this tutorial works the same way whether you speak it through Wispr Flow or type or paste it into Gemini CLI.
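To make the last step of that pipeline concrete, here is a minimal sketch in Python of what "add to my daily note" comes down to on disk. It is not how Gemini CLI or Obsidian implement it; it only illustrates the idea that a daily note is a plain text file. The function name append_to_daily_note, the temporary vault folder, and the YYYY-MM-DD.md file name are assumptions made for illustration, and your vault may be configured differently.

```python
from datetime import date
from pathlib import Path
from typing import Optional
import tempfile


def append_to_daily_note(vault: Path, text: str, today: Optional[date] = None) -> Path:
    """Append one bullet to the daily note for `today` and return the note's path.

    Hypothetical helper: assumes daily notes live at the top of the vault
    and are named YYYY-MM-DD.md.
    """
    today = today or date.today()
    note = vault / f"{today.isoformat()}.md"
    note.parent.mkdir(parents=True, exist_ok=True)
    # Open in append mode so earlier entries in the note are kept.
    with note.open("a", encoding="utf-8") as handle:
        handle.write(f"- {text}\n")
    return note


if __name__ == "__main__":
    # A throwaway folder stands in for a real vault.
    demo_vault = Path(tempfile.mkdtemp())
    note_path = append_to_daily_note(demo_vault, "Sarah mentioned a job opening at Xero")
    print(note_path.read_text(encoding="utf-8"))
```

In the tutorial you never write this yourself. You describe what you want and Gemini CLI works out the equivalent action for you.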

What you will learn

How to control Obsidian using natural language through Gemini CLI
How to capture thoughts instantly by speaking or typing a request
How to add and manage tasks without memorising any commands
How to search across all your notes by simply asking
How to use voice input with Wispr Flow for a hands-free workflow
How to build a simple daily productivity habit with AI

No coding required. Every step uses natural language you can say out loud or copy and paste. If you can describe what you want, you can do this.

Tools

Gemini CLI

Google’s free AI assistant that runs in your terminal. It understands your natural language requests and translates them into actions.

Wispr Flow

Optional voice input tool — speak instead of type. Works in any application, including your terminal.

Obsidian

A free note-taking app that stores your notes as plain text files on your computer. Your data stays with you — no cloud account required.

Node.js

A free JavaScript runtime. Its package manager, npm, is used to install Gemini CLI. One-time setup.

Terminal

The command-line app built into your computer. On macOS it is called Terminal; on Windows it is called PowerShell or Command Prompt.
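Because Obsidian keeps notes as plain text files, the "Find" and "Command" requests ultimately come down to reading those files. The sketch below shows the idea in Python under stated assumptions: notes are .md files, and open tasks use the Markdown checkbox form "- [ ]". The helper names search_vault and open_tasks are hypothetical, and this is not what Gemini CLI runs behind the scenes.

```python
from pathlib import Path
from typing import List, Tuple
import tempfile

Hit = Tuple[str, int, str]  # (file name, line number, line text)


def search_vault(vault: Path, query: str) -> List[Hit]:
    """Return every line in the vault's .md files that contains `query`, ignoring case."""
    needle = query.lower()
    hits: List[Hit] = []
    for note in sorted(vault.rglob("*.md")):
        lines = note.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if needle in line.lower():
                hits.append((note.name, number, line.strip()))
    return hits


def open_tasks(vault: Path) -> List[Hit]:
    """Return unchecked tasks, assuming the '- [ ]' checkbox convention."""
    return search_vault(vault, "- [ ]")


if __name__ == "__main__":
    # A throwaway folder stands in for a real vault.
    demo_vault = Path(tempfile.mkdtemp())
    (demo_vault / "2025-01-02.md").write_text(
        "- [ ] Email Sarah about the Xero role\n- [x] Book meeting room\n",
        encoding="utf-8",
    )
    print(search_vault(demo_vault, "xero"))
    print(open_tasks(demo_vault))
```

A plain substring match is the simplest possible search. An AI assistant can go further by interpreting what you meant, but the files it reads are the same.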

Cost

Tool	Cost
Gemini CLI	Free (1,000 requests/day)
Wispr Flow	Free trial (invite link for a free month of Pro)
Obsidian	Free
Node.js	Free
Terminal	Free (built into your computer)
Total	$0

Prerequisites

A laptop

Windows or macOS. No special hardware needed.

30 minutes

Take your time — there is no rush.

Curiosity

No prior experience needed. Just a willingness to try something new.

Ready to get started? Head to Set up your tools to install everything you need.
