Patrick Loeber
Create a Note Taking App in Python with Speech Recognition and the Notion API

Fun Python Tutorial with Speech Recognition

Patrick Loeber
·Nov 4, 2021·

4 min read

In this Python tutorial, we create a note-taking app with speech recognition and the Notion API.

Our app runs endlessly in the background, listens to the microphone input, and activates itself when it hears a predefined activation command. Whatever we say next is stored as a note in a Notion table.

1. Setup

We use the SpeechRecognition library, and also gtts (Google's text-to-speech) so that our app can talk to us.

First, we need the dependency pyaudio, which SpeechRecognition uses to access the microphone. On a Mac, you need these additional commands (see also Pyaudio Installation):

# Only on Mac:
$ brew install portaudio
$ pip install pyobjc
$ pip install pyaudio

Note: On an M1 Mac, I had to use this command to install pyaudio:

$ python -m pip install --global-option='build_ext' --global-option='-I/opt/homebrew/Cellar/portaudio/19.7.0/include' --global-option='-L/opt/homebrew/Cellar/portaudio/19.7.0/lib' pyaudio

Then install the remaining packages:

$ pip install speechrecognition
$ pip install requests gtts playsound
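
Before writing any code, it can help to confirm that the packages are actually importable. The following is a minimal sketch and not part of the original project: the helper name missing_modules is made up for illustration, and the list contains the import names (which differ from some of the pip package names).

```python
from importlib.util import find_spec

# Import names of the packages installed above. Note that the pip package
# "speechrecognition" is imported as "speech_recognition".
REQUIRED = ["speech_recognition", "pyaudio", "gtts", "playsound", "requests"]


def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies found")
```

If a module is reported as missing, rerun the matching pip command from above.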

2. Code the speech recognition

First, create a file main.py and implement the speech recognition workflow. We create two helper functions to listen to the microphone input and convert the audio input to text.

Additionally, we create one helper function that converts text to speech and plays a sound on the computer:

import speech_recognition as sr
import gtts
from playsound import playsound
import os
from datetime import datetime

r = sr.Recognizer()

def get_audio():
    with sr.Microphone() as source:
        print("Say something")
        audio = r.listen(source)
    return audio

def audio_to_text(audio):
    text = ""
    try:
        text = r.recognize_google(audio)
    except sr.UnknownValueError:
        print("Speech recognition could not understand audio")
    except sr.RequestError:
        print("could not request results from API")
    return text

def play_sound(text):
    try:
        tts = gtts.gTTS(text)
        tempfile = "./temp.mp3"
        tts.save(tempfile)
        playsound(tempfile)
        os.remove(tempfile)
    except AssertionError:
        print("could not play sound")

Now implement an endless loop that puts all the logic together:

  • Listen to the microphone input endlessly
  • If the activation command is recognized, listen to another microphone input
  • Use the second input as the note that should be stored

ACTIVATION_COMMAND = "hey sam"

if __name__ == "__main__":

    while True:
        a = get_audio()
        command = audio_to_text(a)

        if ACTIVATION_COMMAND in command.lower():
            print("activate")
            play_sound("What can I do for you?")

            note = get_audio()
            note = audio_to_text(note)

            if note:
                play_sound(note)

                now = datetime.now().astimezone().isoformat()
                # TODO: save your note wherever you want, e.g. in Notion
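
The activation check in the loop is a plain substring test on the lowercased transcript. Pulled out into its own function, it can be tried without a microphone. This is only a sketch: the function name is_activated and the whitespace normalization are additions for illustration, not part of the code above.

```python
ACTIVATION_COMMAND = "hey sam"


def is_activated(transcript, activation=ACTIVATION_COMMAND):
    """Return True if the activation phrase occurs in the transcript.

    Case is ignored and runs of whitespace are collapsed, so
    "Hey   Sam" still matches "hey sam".
    """
    normalized = " ".join(transcript.lower().split())
    return activation in normalized


print(is_activated("Hey Sam, are you there?"))  # True
print(is_activated(""))                         # False
print(is_activated("hello world"))              # False
```

The empty-string case matters because audio_to_text returns "" whenever the recognition fails, so the loop simply keeps listening.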

3. Set up the Notion API

Follow this guide to set up the Notion API: Setup Guide.

  • Step 1: Create a Notion integration and save the API key
  • Step 2: Create a new database as a full-page table within Notion
  • Step 3: Share the database with your integration, and save the database_id
  • Step 4: In your table, create three fields: "Description" (the title field), "Date" (date field), and "Status" (text field)
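
The database_id from step 3 is the 32-character string in the URL of your table, before the question mark. If you prefer not to copy it by hand, a small helper can pull it out. This is a sketch under assumptions: the function name extract_database_id is hypothetical, and the URL below is a made-up example.

```python
import re
from urllib.parse import urlparse


def extract_database_id(url):
    """Return the 32-character hex ID from a Notion database URL, or None."""
    # Only look at the path, so the "?v=..." view parameter is ignored.
    path = urlparse(url).path
    match = re.search(r"[0-9a-f]{32}", path)
    return match.group(0) if match else None


# Made-up example URL, only for illustration
url = "https://www.notion.so/myworkspace/a8aec43384f447ed84390e8e42c2e089?v=1234"
print(extract_database_id(url))  # a8aec43384f447ed84390e8e42c2e089
```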

4. Use the Notion API in Python

Create a new file notion.py and implement a small client class that uses the requests module. It sends a POST request that stores a new entry in your Notion database.

For more information you can also refer to the post-page example in the official docs.

import json
import requests


class NotionClient:

    def __init__(self, token, database_id) -> None:
        self.database_id = database_id

        self.headers = {
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
            "Notion-Version": "2021-08-16"
        }

    def create_page(self, description, date, status):
        """Create a new page (one table row) and return the HTTP response."""
        create_url = 'https://api.notion.com/v1/pages'

        # The property names must match the column names of your table
        data = {
            "parent": {"database_id": self.database_id},
            "properties": {
                "Description": {
                    "title": [
                        {
                            "text": {
                                "content": description
                            }
                        }
                    ]
                },
                "Date": {
                    "date": {
                        "start": date,
                        "end": None
                    }
                },
                "Status": {
                    "rich_text": [
                        {
                            "text": {
                                "content": status
                            }
                        }
                    ]
                }
            }
        }

        data = json.dumps(data)
        res = requests.post(create_url, headers=self.headers, data=data)
        print(res.status_code)
        return res
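
The request body is the part most likely to need changes if your table uses different column names. As a sketch, the same structure can be built by a standalone function and inspected without sending anything to Notion. The function name build_page_payload and the sample values are hypothetical; the property names match the table from step 3.

```python
import json
from datetime import datetime


def build_page_payload(database_id, description, date, status):
    """Build the JSON body for a new page, mirroring create_page above."""
    return {
        "parent": {"database_id": database_id},
        "properties": {
            "Description": {"title": [{"text": {"content": description}}]},
            "Date": {"date": {"start": date, "end": None}},
            "Status": {"rich_text": [{"text": {"content": status}}]},
        },
    }


# ISO 8601 timestamp with the local UTC offset, as used in main.py
now = datetime.now().astimezone().isoformat()
payload = build_page_payload("YOUR_DATABASE_ID", "Buy milk", now, "Active")
print(json.dumps(payload, indent=2))
```

Printing the payload this way makes it easier to compare against the post-page example in the official docs when a request is rejected.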

5. Integrate Notion API into the app

The only thing left to do now is to use the Notion client in main.py. Add the following code to that file:

from notion import NotionClient

token = "YOUR NOTION TOKEN HERE"
database_id = "YOUR NOTION DATABASE_ID HERE"

client = NotionClient(token, database_id)
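
Hard-coding the token is fine for a quick test, but it is easy to commit by accident. One common alternative, sketched here as an assumption rather than part of the original code, is to read both values from environment variables. The variable names NOTION_TOKEN and NOTION_DATABASE_ID and the helper load_notion_config are made up for this example.

```python
import os


def load_notion_config(env=os.environ):
    """Read the Notion token and database ID from environment variables.

    Raises RuntimeError with a readable message if either one is missing.
    """
    token = env.get("NOTION_TOKEN")
    database_id = env.get("NOTION_DATABASE_ID")
    if not token or not database_id:
        raise RuntimeError("Set NOTION_TOKEN and NOTION_DATABASE_ID first")
    return token, database_id
```

In main.py, the two hard-coded strings would then become token, database_id = load_notion_config(), and the NotionClient is created exactly as before.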

Then, at the end of the code, replace the block where we left the TODO with this:

if note:
    play_sound(note)

    now = datetime.now().astimezone().isoformat()
    res = client.create_page(note, now, status="Active")
    if res.status_code == 200:
        play_sound("Stored new item")
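
The snippet above stays silent when the request fails, for example because of a wrong token. A small helper can pick a spoken message for both cases. This is a sketch: feedback_for is a hypothetical name, and the only thing it relies on is the status_code attribute of the response returned by requests.

```python
from types import SimpleNamespace


def feedback_for(res):
    """Choose the message to speak based on the HTTP status code."""
    if res.status_code == 200:
        return "Stored new item"
    return f"Could not store the note, status {res.status_code}"


# Stand-ins for real responses so the example runs without network access
print(feedback_for(SimpleNamespace(status_code=200)))  # Stored new item
print(feedback_for(SimpleNamespace(status_code=401)))  # Could not store the note, status 401
```

In the app, the result would be passed straight to play_sound, so you hear immediately whether the note was saved.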

And that's it! Congratulations, you should now have a fully functioning note-taking app with speech recognition!

I hope you enjoyed this project!

The full code can also be found on GitHub. If you want to connect, feel free to reach out on Twitter!

 