Seeding a Database in Ruby on Rails

Jul 29, 2024

Ruby on Rails has excellent tools for seeding a database, and thanks to the work of the community, some gems make this task easier.

Apart from seeding a database, we have helpful tools to check the database and better organize the data seeds.

Creating a sample application

Let’s start by creating a new application:

rails new sample
cd sample

Creating a model

Next, generate a new model. If you are curious, running bin/rails generate (or the bin/rails g shortcut) with no arguments lists all the available generators.

bin/rails g model Movie title director storyline:text watched_on:date

Here you are setting title and director as strings (the default type when none is specified), storyline as text, and watched_on as a date. For dates, as opposed to datetimes, the convention is to append _on to the name.
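To make the name:type convention concrete, here is a minimal sketch of the rule described above. It is an illustration only and not the parser Rails uses internally; the parse_fields helper is a hypothetical name.

```ruby
# Illustrative only: mimics the "name:type" convention of `rails g model`.
# This is NOT Rails' own parser; parse_fields is a hypothetical helper.
def parse_fields(args)
  args.to_h do |arg|
    name, type = arg.split(":", 2)
    [name, type || "string"] # the type falls back to string when omitted
  end
end

fields = parse_fields(%w[title director storyline:text watched_on:date])
fields.each { |name, type| puts "#{name}: #{type}" }
```

The fields without an explicit type come out as string, which matches the migration Rails generates for this model.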

Rails will generate a migration for you adapted to the default database, which is SQLite. Migrations are saved in db/migrate. Let’s see what it looks like:

class CreateMovies < ActiveRecord::Migration[7.1]
  def change
    create_table :movies do |t|
      t.string :title
      t.string :director
      t.text :storyline
      t.date :watched_on

      t.timestamps
    end
  end
end

As you can see, Rails appends the framework version you are using (7.1 here) in square brackets to the parent class.

The timestamps statement will generate the created_at and updated_at fields automatically.

Let’s run it:

bin/rails db:migrate

Now Rails has actually created the table. If you made a mistake, you can always roll back:

bin/rails db:rollback

This command accepts an optional step parameter to go back as many migrations as needed. For example, if you want to undo 2 migrations, you can do it like this:

bin/rails db:rollback STEP=2

Let’s see what the schema, stored in db/schema.rb, looks like after running the migration:

ActiveRecord::Schema[7.1].define(version: 2024_08_04_102643) do
  create_table "movies", force: :cascade do |t|
    t.string "title"
    t.string "director"
    t.text "storyline"
    t.date "watched_on"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
  end
end

This file will contain the entire database schema as we run more migrations.

Rails commands

By using the -T parameter you can see a list of the available Rails commands:

bin/rails -T

You can even filter by namespace, such as db:

bin/rails -T db

Creating some seeds

Let’s get to the interesting part of this article. Open the seeds file (db/seeds.rb), and paste this:

Movie.destroy_all

Movie.create!([{
  title: "Inside Out",
  director: "Pete Docter",
  storyline: "As Riley’s family moves to a new city, her emotions —Joy, Sadness, Anger, Fear, and Disgust— navigate the challenges of adjusting to a new environment.",
  watched_on: 9.years.ago
},
{
  title: "Toy Story 4",
  director: "Josh Cooley",
  storyline: "Woody, Buzz Lightyear and the rest of the gang embark on a road trip with Bonnie and a new toy named Forky. The adventurous journey turns into an unexpected reunion as Woody's slight detour leads him to his long-lost friend Bo Peep.",
  watched_on: 5.years.ago
},
{
  title: "Soul",
  director: "Pete Docter",
  storyline: "After landing the gig of a lifetime, a New York jazz pianist suddenly finds himself trapped in a strange land between Earth and the afterlife.",
  watched_on: 4.years.ago
}])

p "Created #{Movie.count} movies"

First, you destroy all movies to start from a clean state, and then you add three movies by passing an array to the create! method. Handy expressions such as 9.years.ago, used here to define dates, are provided by the Active Support Core Extensions.
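For intuition, a rough standard-library stand-in for those date expressions is sketched below. It assumes plain Ruby without Active Support, and the years_ago helper is a hypothetical name. In Rails, 9.years.ago returns a time value, which is cast to a date when assigned to a date column such as watched_on.

```ruby
require "date"

# Hypothetical helper: an approximate plain-Ruby counterpart of
# Active Support's N.years.ago, returning a Date instead of a time.
def years_ago(years, from: Date.today)
  from.prev_year(years)
end

[9, 5, 4].each { |years| puts years_ago(years).iso8601 }
```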

In the end, there’s some feedback about the total movies created. Let’s run it!

bin/rails db:seed
# "Created 3 movies"

You can execute this command as many times as you need, since existing records are deleted thanks to the first line containing the destroy statement.

To check them, you can use rails runner:

bin/rails runner "p Movie.pluck :title"
# ["Inside Out", "Toy Story 4", "Soul"]

Using a custom Rails task to seed actual data

Seeds like these are meant to be development data, not real data for production use. So don’t seed production the way you just did, mainly because the first line deletes every movie!
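One way to protect yourself is to make the seeds file refuse to run in production. The following is a minimal sketch under that assumption; ensure_destructive_seeds_allowed! is a hypothetical helper name, while Rails.env is the standard way to read the current environment.

```ruby
# Hypothetical helper: raises when asked to run destructive seeds in production.
def ensure_destructive_seeds_allowed!(env_name)
  return unless env_name.to_s == "production"

  raise "Refusing to run destructive seeds in production"
end

# In db/seeds.rb this would go first, before Movie.destroy_all:
#   ensure_destructive_seeds_allowed!(Rails.env)
ensure_destructive_seeds_allowed!("development") # passes silently
```

Raising an error stops db:seed before any record is deleted.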

To seed actual data, it is best to create a custom Rails task. Let’s generate one to add genres.

First generate the model and then migrate the database. Next, create the task.

bin/rails g model Genre name

bin/rails db:migrate

bin/rails g task movies seed_genres

This command creates a movies.rake file in the lib/tasks directory containing the seed_genres task.

Paste this code into the new rake file:

namespace :movies do
  desc "Seeds genres"
  task seed_genres: :environment do
    Genre.create!([{
      name: "Action"
    },
    {
      name: "Sci-Fi"
    },
    {
      name: "Adventure"
    }])

    p "Created #{Genre.count} genres"
  end
end

It’s now listed in the Rails commands list:

bin/rails -T movies

Time to run it!

bin/rails movies:seed_genres
# "Created 3 genres"
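One caveat: create! inserts new rows on every run, so executing this task twice would leave six genres. If you want a task that is safe to repeat, a possible sketch uses find_or_create_by!, a standard Active Record method. The seed_genres helper and the GENRE_NAMES constant are hypothetical names introduced for this illustration.

```ruby
# Genre names taken from the task above.
GENRE_NAMES = ["Action", "Sci-Fi", "Adventure"].freeze

# Creates each genre only when no row with that name exists yet.
# `model` must respond to find_or_create_by!, as Active Record models do.
def seed_genres(model, names = GENRE_NAMES)
  names.each { |name| model.find_or_create_by!(name: name) }
end

# Inside the rake task you would call:
#   seed_genres(Genre)
```

Passing the model as an argument keeps the helper easy to try out in isolation. Inside the task you could just as well call Genre.find_or_create_by! directly.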

Loading seeds using the console

The console is handy for playing with your data. Let’s open it:

bin/rails c

Did you know that you can load your seeds from inside the console? Try this:

Rails.application.load_seed

Playing with data using the console sandbox

Sometimes you need to run destructive commands against real data, in development or production, without making the changes permanent. The console offers a kind of safe mode for this: you can do whatever you want, and every change is rolled back when you exit.

This mode is called sandbox, and you enter it by passing the --sandbox flag to the console:

bin/rails c --sandbox

This technique is handy for debugging a real database, such as when a user says they are trying to update their profile name and see a weird error. You could reproduce that error directly using the sandbox mode without affecting the actual data.

Loading more seeds using Faker

If you need, for example, 100 movies, you can replace your seeds file with this:

Movie.destroy_all

100.times do |index|
  Movie.create!(title: "Title #{index}",
                director: "Director #{index}",
                storyline: "Storyline #{index}",
                watched_on: index.days.ago)
end

p "Created #{Movie.count} movies"

Now run the seed task:

bin/rails db:seed
# "Created 100 movies"

But the result does not look realistic at all:

bin/rails runner "p Movie.select(:title, :director, :storyline).last"
#<Movie id: nil, title: "Title 99", director: "Director 99", storyline: "Storyline 99">

Time to use Faker, a gem that generates random fake values. Add it to the development group of your Gemfile:

bundle add faker --group development

The bundle add command also installs the gem, so there is no need to run bundle install separately. Now replace your seeds file with this:

Movie.destroy_all

100.times do
  Movie.create!(title: Faker::Movie.title,
                director: Faker::Name.name,
                storyline: Faker::Lorem.paragraph,
                watched_on: Faker::Time.between(from: 4.months.ago, to: 1.week.ago))
end

p "Created #{Movie.count} movies"

Check it out again:

bin/rails db:seed
# "Created 100 movies"
bin/rails runner "p Movie.select(:title, :director, :storyline, :watched_on).last"
#<Movie title: "The Big Lebowski", director: "Alfonso Kertzmann", storyline: "Sit ea aut. Iure iusto deserunt. Ratione autem aut...", watched_on: "2024-05-24", id: nil>

Much better!
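Keep in mind that random values change on every run. If you need repeatable sample data, the underlying idea is a seeded random generator. Below is a minimal plain-Ruby sketch of that idea which does not use Faker; the TITLES list and the sample_movies helper are hypothetical names introduced for illustration.

```ruby
require "date"

# Hypothetical sample values; replace them with your own.
TITLES = ["Inside Out", "Toy Story 4", "Soul"].freeze

# Builds attribute hashes for `count` movies with a seeded generator,
# so the same seed and date always produce the same data.
def sample_movies(count, seed:, today: Date.today)
  rng = Random.new(seed)
  Array.new(count) do
    {
      title: TITLES.sample(random: rng),
      watched_on: today - rng.rand(7..120) # between 7 and 120 days ago
    }
  end
end

sample_movies(3, seed: 42).each { |attrs| p attrs }

# In db/seeds.rb you could then write:
#   Movie.create!(sample_movies(100, seed: 42))
```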

Conclusion

Seeding the database while you develop an application is essential, because it lets you work with data that feels real.

Also, knowing the tools available for working with seeds is good for your productivity, so it is worth investing time in learning them.

If you are new to Ruby on Rails, I would like to recommend a course I created on LinkedIn Learning where I explain the basics of the framework from a practical point of view. You will learn the basics of Active Record, the Scaffold generator, the asset pipeline, ERB code, working with images with Active Storage, reactive code with Hotwire, and much more.

The course is called Hands-On Introduction: Ruby on Rails. In it, you start developing with the Ruby on Rails web framework, using a GitHub Codespaces environment.