Google Colab: Jupyter with AI

Published: Jan 2, 2024 by Isaac Johnson

Colab is Google’s Jupyter offering. It is based on Jupyter Notebook, a popular tool for interactive coding.

You can use Colab to write and run Python code alongside text, images, HTML, LaTeX, and more. You can also use Colab to learn from various tutorials as well as make your own.

Usage

To start, go to colab.research.google.com.

/content/images/2024/01/Colaboratory-01.png

We can then try Generative AI

/content/images/2024/01/Colaboratory-02.png

Let’s try generating a Python script to load data into a Redis host

I entered the text:

random generated 5 Mb text file that is then loaded 100 times into a Redis server specified by an environment variable using python

and it came back with

# prompt: random generated 5 Mb text file that is then loaded 100 times into a redis server specified by an environment variable using python

import os
import random
import string
import redis

def generate_text(length):
  """Generates a random text file of the specified length."""
  chars = string.ascii_letters + string.digits
  return ''.join(random.choice(chars) for i in range(length))

def load_text(filename, redis_server):
  """Loads the specified text file into the redis server."""
  with open(filename, 'r') as f:
    text = f.read()
  redis_server.set(filename, text)

def main():
  """Generates a random text file, loads it into a redis server 100 times, and then prints the size of the redis server."""
  length = 5 * 1024 * 1024  # 5 MB
  filename = 'random_text.txt'
  redis_server = redis.Redis(host=os.environ['REDIS_HOST'], port=os.environ['REDIS_PORT'])
  generate_text(length)
  for i in range(100):
    load_text(filename, redis_server)
  print(redis_server.info()['used_memory'])

if __name__ == '__main__':
  main()

/content/images/2024/01/Colaboratory-03.png
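Reading the generated script closely, main() calls generate_text(length) but throws away the result, so random_text.txt is never written before load_text tries to open it. Below is a minimal sketch of one way to close that gap. The names write_text_file and FakeRedis are my own (hypothetical, not part of what Colab generated), and FakeRedis is only an in-memory stand-in with a set() method so the sketch runs without a server.

```python
import os
import random
import string
import tempfile


def generate_text(length):
    """Return a random alphanumeric string of the given length."""
    chars = string.ascii_letters + string.digits
    return "".join(random.choices(chars, k=length))


def write_text_file(filename, length):
    """Hypothetical helper: actually write the generated text to disk."""
    with open(filename, "w") as f:
        f.write(generate_text(length))


def load_text(filename, client, key):
    """Read the file and store its contents under the given key."""
    with open(filename, "r") as f:
        client.set(key, f.read())


class FakeRedis:
    """Hypothetical in-memory stand-in that only mimics set()."""

    def __init__(self):
        self.data = {}

    def set(self, key, value):
        self.data[key] = value


client = FakeRedis()
with tempfile.TemporaryDirectory() as tmp:
    filename = os.path.join(tmp, "random_text.txt")
    write_text_file(filename, 1024)  # 1 KB keeps the demo quick
    for i in range(3):
        load_text(filename, client, f"random_text:{i}")

print(len(client.data))
```

Swapping FakeRedis for a real redis.Redis(...) client and raising the length to 5 * 1024 * 1024 should bring this back in line with the original prompt.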

If we want to save it, we have some options

/content/images/2024/01/Colaboratory-04.png

Such as Google Drive

/content/images/2024/01/Colaboratory-05.png

To run this Python, I’ll need to install the redis Python module

/content/images/2024/01/Colaboratory-06.png

We can use !pip install redis

/content/images/2024/01/Colaboratory-07.png
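If you want to confirm from a code cell that the install took, a small check like the following should do. This is just a sketch using the standard library; is_installed is a hypothetical helper name of mine.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# True once the pip install above has succeeded in this runtime
print(is_installed("redis"))
```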

Now when we try running the Python, we see it loads redis but complains that the keys it expects are not set

/content/images/2024/01/Colaboratory-08.png
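Assuming the complaint comes from the os.environ['REDIS_HOST'] and os.environ['REDIS_PORT'] lookups (which raise KeyError when the variable is missing), one way around it in a notebook is to set defaults before the script runs and to read them with a fallback. This is a sketch only: the localhost and 6379 values are placeholders, and redis_settings is a hypothetical helper name.

```python
import os

# Placeholder defaults; point these at your own Redis server.
os.environ.setdefault("REDIS_HOST", "localhost")
os.environ.setdefault("REDIS_PORT", "6379")


def redis_settings(env=os.environ):
    """Return (host, port) from the environment, with the port as an int."""
    host = env.get("REDIS_HOST", "localhost")
    port = int(env.get("REDIS_PORT", "6379"))
    return host, port


print(redis_settings())
```

The resulting pair can then be passed along as redis.Redis(host=host, port=port).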

If I want to try it locally, I can download the Python

/content/images/2024/01/Colaboratory-09.png

Since Chrome wants to be a nanny now, I have to double click

/content/images/2024/01/Colaboratory-10.png

And then I can see it locally

builder@LuiGi17:/mnt/c/Users/isaac/Downloads$ cat copy_of_untitled0.py
# -*- coding: utf-8 -*-
"""Copy of Untitled0.ipynb

Automatically generated by Colaboratory.

Original file is located at
    https://colab.research.google.com/drive/1Uc99zAGAlEQHx__CoVak4r5mwiAkiNQI
"""

!pip install redis

# prompt: random generated 5 Mb text file that is then loaded 100 times into a redis server specified by an environment variable using python

import os
import random
import string
import redis

def generate_text(length):
  """Generates a random text file of the specified length."""
  chars = string.ascii_letters + string.digits
  return ''.join(random.choice(chars) for i in range(length))

def load_text(filename, redis_server):
  """Loads the specified text file into the redis server."""
  with open(filename, 'r') as f:
    text = f.read()
  redis_server.set(filename, text)

def main():
  """Generates a random text file, loads it into a redis server 100 times, and then prints the size of the redis server."""
  length = 5 * 1024 * 1024  # 5 MB
  filename = 'random_text.txt'
  redis_server = redis.Redis(host=os.environ['REDIS_HOST'], port=os.environ['REDIS_PORT'])
  generate_text(length)
  for i in range(100):
    load_text(filename, redis_server)
  print(redis_server.info()['used_memory'])

if __name__ == '__main__':
  main()

I decided to try one more time to use Colab to solve my problem and generate a Dockerfile

Dockerfile for python that installs the redis python module first and expects two environment vars named REDIS_HOST and REDIS_PORT

/content/images/2024/01/Colaboratory-11.png

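The Dockerfile doesn’t quite seem right: it installs a Redis server the script doesn’t need, never copies a script into /app, and doesn’t mention either environment variable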

FROM python:3.7-slim

RUN apt-get update && apt-get install -y redis-server

RUN pip install redis

CMD python /app/main.py

Let’s compare that with Copilot (Bing AI):

/content/images/2024/01/Colaboratory-12.png

# Use python:3 as the base image
FROM python:3

# Set the working directory to /app
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install the redis python module using pip
RUN pip install redis

# Set the environment variables for the redis host and port
ENV REDIS_HOST some-redis
ENV REDIS_PORT 6379

# Run the python script when the container launches
CMD ["python", "redis_test_script.py"]

And while I didn’t ask for it, it also kicked out a python script I could use

/content/images/2024/01/Colaboratory-13.png

import redis
import os

# Get the redis host and port from the environment variables
redis_host = os.environ.get("REDIS_HOST", "localhost")
redis_port = int(os.environ.get("REDIS_PORT", "6379"))

# Create a redis client object
r = redis.Redis(host=redis_host, port=redis_port, db=0)

# Test the connection by setting and getting a key
r.set("foo", "bar")
print(r.get("foo"))

I also asked Copilot for the same Python script with 5 MB of random data:

import os
import random
import string
import redis

# Generate a random 5 Mb text file
file_name = "random.txt"
file_size = 5 * 1024 * 1024 # 5 Mb in bytes
chars = string.ascii_letters + string.digits # Possible characters to use
with open(file_name, "w") as f:
    for i in range(file_size):
        f.write(random.choice(chars)) # Write a random character

# Get the redis host and port from the environment variable
redis_host = os.environ.get("REDIS_HOST", "localhost")
redis_port = int(os.environ.get("REDIS_PORT", "6379"))

# Create a redis client object
r = redis.Redis(host=redis_host, port=redis_port, db=0)

# Load the file 100 times into the redis server
for i in range(100):
    key = f"file:{i}" # Use a different key for each file
    with open(file_name, "rb") as f:
        value = f.read() # Read the file content as bytes
        r.set(key, value) # Set the key-value pair in redis

Okay, so we now see how Google Colab worked out a Python script. How does it compare to their other AI tool, Bard?

It actually generated even better code:

import os
import redis
import random
import string

# Get Redis server address from environment variable
REDIS_HOST = os.environ.get("REDIS_HOST", "localhost")
REDIS_PORT = 6379  # Default Redis port

# Connect to Redis
redis_client = redis.Redis(host=REDIS_HOST, port=REDIS_PORT)

# Generate 5 MB of random text
random_text = ''.join(random.choice(string.ascii_letters + string.digits) for _ in range(5 * 1024 * 1024))

# Load the text into Redis 100 times with unique keys
for i in range(100):
    key = f"random_text_{i}"
    redis_client.set(key, random_text)

print(f"Generated 5 MB random text and loaded it 100 times into Redis at {REDIS_HOST}:{REDIS_PORT}")

It also included a nice write up of the variables as well

/content/images/2024/01/Colaboratory-14.png

I asked about the Dockerfile:

We can see it is pretty usable code:

FROM python:3.9-slim  # Base image with Python 3.9

WORKDIR /app  # Set working directory

COPY requirements.txt .  # Copy requirements.txt to the container

RUN pip install -r requirements.txt  # Install Redis module

# Declare environment variables
ENV REDIS_HOST localhost
ENV REDIS_PORT 6379

COPY . .  # Copy the rest of your application code

# Define the command to run when the container starts
CMD ["python", "your_script.py"]  # Replace with your Python script

with a full writeup

/content/images/2024/01/Colaboratory-15.png

Side-by-side we can see how similar they are

/content/images/2024/01/Colaboratory-16.png

And for comparison, OpenAI gave me back

import os
import random
import string
import redis

# Function to generate a random text of given size
def generate_random_text(size):
    return ''.join(random.choices(string.ascii_letters + string.digits, k=size))

# Function to generate a random 5 MB text file
def generate_text_file(filename):
    text = generate_random_text(5 * 1024 * 1024)  # 5 MB in bytes
    with open(filename, 'w') as file:
        file.write(text)

# Function to load a text file into Redis
def load_into_redis(redis_client, filename):
    with open(filename, 'r') as file:
        content = file.read()
        redis_client.set(filename, content)

if __name__ == "__main__":
    # Set the Redis server details from environment variables
    redis_host = os.environ.get("REDIS_HOST", "localhost")
    redis_port = int(os.environ.get("REDIS_PORT", 6379))
    
    # Generate and load the text file into Redis 100 times
    for i in range(100):
        filename = f"random_text_{i}.txt"
        generate_text_file(filename)
        
        # Connect to the Redis server
        redis_client = redis.Redis(host=redis_host, port=redis_port)
        
        # Load the text file into Redis
        load_into_redis(redis_client, filename)
        
        # Delete the local text file
        os.remove(filename)
        
    print("Data loaded into Redis successfully.")

It’s kind of wild how different all four ended up being:

/content/images/2024/01/Colaboratory-17.png
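One difference worth spelling out is the key naming. The Colab script sets the same key (the filename) on every pass, while the Copilot, Bard and OpenAI versions use a different key each time. Since a Redis SET overwrites an existing key, that changes how much data ends up stored. Here is a rough sketch of the arithmetic using a plain dict in place of Redis; it counts payload bytes only and ignores any per-key overhead on the server, and stored_bytes is a hypothetical helper.

```python
FIVE_MB = 5 * 1024 * 1024  # payload size in bytes


def stored_bytes(keys, value_size=FIVE_MB):
    """Total payload left behind after one SET per key, in order."""
    store = {}
    for key in keys:
        store[key] = value_size  # setting an existing key replaces it
    return sum(store.values())


same_key = ["random_text.txt"] * 100            # Colab-style: one key reused
unique_keys = [f"file:{i}" for i in range(100)]  # Copilot-style: 100 keys

print(stored_bytes(same_key))     # 5242880 (about 5 MB)
print(stored_bytes(unique_keys))  # 524288000 (about 500 MB)
```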

Colab text

I tried using some markdown, which worked. However, adding an image seemed to nearly crash the browser

/content/images/2024/01/Colaboratory-18.png

and seemed to insert the ASCII equivalent of the file

/content/images/2024/01/Colaboratory-19.png

/content/images/2024/01/Colaboratory-20.png

When I came back, I saw the image - perhaps the slow remote connection I’m writing from was the problem

/content/images/2024/01/Colaboratory-21.png

I tried to ask for something newer I could use in the Text block

create a mermaidjs data flow diagram showing the states of start, processing and complete with arrows between

but no luck

/content/images/2024/01/Colaboratory-22.png

I used Copilot to get me some valid code

/content/images/2024/01/Colaboratory-23.png

Then tried it

/content/images/2024/01/Colaboratory-24.png

I guess it’s limited as far as Markdown-driven Mermaid goes.

However, credit to this writeup which showed how to use Python to generate Mermaid graphs in Jupyter notebooks

import base64
from IPython.display import Image, display
import matplotlib.pyplot as plt

def mm(graph):
  graphbytes = graph.encode("ascii")
  base64_bytes = base64.b64encode(graphbytes)
  base64_string = base64_bytes.decode("ascii")
  display(
    Image(
      url="https://mermaid.ink/img/"
      + base64_string
    )
  )

mm("""
graph LR;
    A--> B & C & D;
    B--> A & E;
    C--> A & E;
    D--> A & E;
    E--> B & C & D;
""")

Which worked fabulously

/content/images/2024/01/Colaboratory-25.png
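For the start, processing and complete diagram I had asked for earlier, the same approach should work. Here is a sketch that only builds the mermaid.ink URL, using the same base64 encoding as the mm helper, so it can be checked without a network call. The mermaid_ink_url name and the node labels are my own choices.

```python
import base64


def mermaid_ink_url(graph):
    """Build a mermaid.ink image URL the same way the mm() helper does."""
    encoded = base64.b64encode(graph.encode("utf-8")).decode("ascii")
    return "https://mermaid.ink/img/" + encoded


state_graph = """graph LR;
    Start --> Processing;
    Processing --> Complete;
"""

url = mermaid_ink_url(state_graph)
print(url)
```

In a notebook, passing that URL to IPython.display.Image(url=url) and then display(...) should render it just like the example above.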

Ultimately, this is really what I need - the ability to write some code I can ‘play’ (execute), some formatted text and lastly some diagrams to explain more complicated ideas.

I like MermaidJS primarily because it is widely used so there are plenty of examples and I’m pretty certain it will live for a long while.

Next, I’m curious how well sharing works. I created a link to this notebook for viewing

/content/images/2024/01/Colaboratory-26.png

Which worked in an InPrivate window without issue

/content/images/2024/01/Colaboratory-27.png

Paid Options

Colab, by default, is free. If you use a lot of compute credits, there are a few paid options as well

/content/images/2024/01/Colaboratory-28.png

and we can see what that translates to in USD on the signup page

/content/images/2024/01/Colaboratory-29.png

Summary

So we took a basic tour of Google Colab and Jupyter Notebooks in general. We compared the code generated by Colab with that of three other AIs, namely Bard, Bing/Copilot and OpenAI. Lastly, we explored some of the text features of Colab including Markdown, images and MermaidJS (diagrams).

Credit to Sparisoma Viridi for the great collection of Mermaid+Python examples. It’s worth adding a bookmark.


Isaac Johnson


Cloud Solutions Architect

Isaac is a CSA and DevOps engineer who focuses on cloud migrations and devops processes. He also is a dad to three wonderful daughters (hence the references to Princess King sprinkled throughout the blog).
