I needed to build a custom dynamic provider for Faker (the Python library), and the docs offered only a surface-level explanation of how to do it. Other resources were either not very helpful for what I wanted to accomplish or too hard to find (at least I found them to be so…).

As such, the motivation behind this writing is, “Things I wish I could have found in my exhaustive search.”

In this blog post I will get into some Basics First (feel free to skip) before I try For a Moment to frame the problem I had. Finally, I’ll Spill the Beans in hopes that someone out there will one day find it useful.

Basics First


Faker is a library that provides functions for generating (You Guessed It!) fake data. To use Faker, you first import the Faker class, create an instance, and then use it to generate a fake address like so:

from faker import Faker

fake = Faker()

print(fake.address())

# Outputs:
# 02358 Allen Meadow Suite 901
# New Jessica, OK 81306

Reading the docs, you find that Faker generates these fake addresses using provider classes. Faker comes with many standard providers built in, and several more community providers are available to install via pip.

As an example, we can install a community provider for air travel:

pip install faker-airtravel

Then, in Python:

from faker import Faker
from faker_airtravel import AirTravelProvider

fake = Faker()

fake.add_provider(AirTravelProvider)

print(fake.airport_name())

# Outputs:
# Geneva airport

For A Moment


Great! With Faker we have standard providers and some community ones!

This should be all we need, right? Wrong!

Sometimes you need to make your own provider. The docs go into some detail on how to accomplish this but really only scratch the surface.

From the Faker docs (circa February 2023 AD):

from faker import Faker
from faker.providers import DynamicProvider

medical_professions_provider = DynamicProvider(
     provider_name="medical_profession",
     elements=["dr.", "doctor", "nurse", "surgeon", "clerk"],
)

fake = Faker()

# then add new provider to faker instance
fake.add_provider(medical_professions_provider)

# now you can use:
fake.medical_profession()
# 'dr.'

This is great if you just need a quick way to randomly select from a list, but what if you want something more dynamic?

Spill the Beans


By extending the BaseProvider class, we can define lists of elements and then write methods, callable from the Faker object, that return values from those lists.

In the example below, MedicalProfessionalProvider extends the BaseProvider class and defines two lists, prefix_elements and suffix_elements.

from faker import Faker
from faker.providers import BaseProvider, ElementsType

fake = Faker()


class MedicalProfessionalProvider(BaseProvider):

    prefix_elements: ElementsType[str] = [
        "Dr.", "Doctor", "Nurse", "Surgeon", "Clerk"]

    suffix_elements: ElementsType[str] = ["MD", "MD, PhD", "PA", "DO"]

Then we define two methods, prefix and suffix, each of which returns a random element from the corresponding list we defined above.

from faker import Faker
from faker.providers import BaseProvider, ElementsType

fake = Faker()


class MedicalProfessionalProvider(BaseProvider):

    prefix_elements: ElementsType[str] = [
        "Dr.", "Doctor", "Nurse", "Surgeon", "Clerk"]

    suffix_elements: ElementsType[str] = ["MD", "MD, PhD", "PA", "DO"]

    def prefix(self) -> str:
        return self.random_element(self.prefix_elements)

    def suffix(self) -> str:
        return self.random_element(self.suffix_elements)

Then we can define patterns, which place the names of methods between double curly braces so that the generator can later parse them.

For example:

# the pattern
    "{{prefix}} {{first_name}} {{last_name}}"

# will be parsed into
    "Dr. John Smith"

In the example below, name_patterns is a list of patterns. The method medical_professional_name selects a random pattern from that list and parses it with the generator to return more specific names of medical professionals.

It is important to note that the tokens in the patterns are the names of methods defined in the class.

It is also important to note that tokens in a pattern don’t have to be limited to the methods defined in the enclosing class. They can reference anything available to the entire Faker object. In the example below, first_name and last_name are methods defined by the standard person provider, but I was able to reference them anyway.

from faker import Faker
from faker.providers import BaseProvider, ElementsType

fake = Faker()


class MedicalProfessionalProvider(BaseProvider):

    prefix_elements: ElementsType[str] = [
        "Dr.", "Doctor", "Nurse", "Surgeon", "Clerk"]

    suffix_elements: ElementsType[str] = ["MD", "MD, PhD", "PA", "DO"]

    name_patterns: ElementsType[str] = [
        "{{prefix}} {{first_name}} {{last_name}}",
        "{{first_name}} {{last_name}} {{suffix}}"
    ]

    def medical_professional_name(self) -> str:
        pattern = self.random_element(self.name_patterns)
        return self.generator.parse(pattern)

    def prefix(self) -> str:
        return self.random_element(self.prefix_elements)

    def suffix(self) -> str:
        return self.random_element(self.suffix_elements)


fake.add_provider(MedicalProfessionalProvider)

for _ in range(10):
    print(fake.medical_professional_name())


# Outputs:
# Doctor Brittany Lee
# Doctor Loretta Brown
# Surgeon Ian Gomez
# Daniel Wilson MD
# Donald Weiss MD
# Jennifer Ballard PA
# Surgeon Amber Grant
# Clerk Dennis Gaines
# Andrew Webb DO
# Anthony Thornton MD, PhD