Watch Now: This tutorial has a related video course created by the Real Python team. Watch it together with the written tutorial to deepen your understanding: Editing Excel Spreadsheets in Python With openpyxl

Excel spreadsheets are one of those things you might have to deal with at some point. Either it's because your boss loves them or because marketing needs them, you might have to learn how to work with spreadsheets, and that's when knowing openpyxl comes in handy!

Spreadsheets are a very intuitive and user-friendly way to manipulate large datasets without any prior technical background. That's why they're still so commonly used today.

In this article, you'll learn how to use openpyxl to:

  • Manipulate Excel spreadsheets with confidence
  • Extract information from spreadsheets
  • Create simple or more complex spreadsheets, including adding styles, charts, and so on

This article is written for intermediate developers who have a pretty good knowledge of Python data structures, such as dicts and lists, but also feel comfortable around OOP and more intermediate level topics.

Before You Begin

If you ever get asked to extract some data from a database or log file into an Excel spreadsheet, or if you often have to convert an Excel spreadsheet into some more usable programmatic form, then this tutorial is perfect for you. Let's jump into the openpyxl caravan!

Practical Use Cases

First things first, when would you need to use a package like openpyxl in a real-world scenario? You'll see a few examples below, but really, there are hundreds of possible scenarios where this knowledge could come in handy.

Importing New Products Into a Database

You are responsible for tech in an online store company, and your boss doesn't want to pay for a cool and expensive CMS system.

Every time they want to add new products to the online store, they come to you with an Excel spreadsheet with a few hundred rows and, for each of them, you have the product name, description, price, and so forth.

Now, to import the data, you'll have to iterate over each spreadsheet row and add each product to the online store.

Exporting Database Information Into a Spreadsheet

Say you have a Database table where you record all your users' information, including name, phone number, email address, and so forth.

Now, the Marketing team wants to contact all users to give them some discounted offer or promotion. However, they don't have access to the Database, or they don't know how to use SQL to extract that information easily.

What can you do to help? Well, you can make a quick script using openpyxl that iterates over every single User record and puts all the essential information into an Excel spreadsheet.

That's gonna earn you an extra slice of cake at your company's next birthday party!

Appending Information to an Existing Spreadsheet

You may also have to open a spreadsheet, read the information in it and, according to some business logic, append more data to it.

For example, using the online store scenario again, say you get an Excel spreadsheet with a list of users and you need to append to each row the total amount they've spent in your store.

This data is in the Database and, in order to do this, you have to read the spreadsheet, iterate through each row, fetch the total amount spent from the Database and then write back to the spreadsheet.

Not a problem for openpyxl!

Learning Some Basic Excel Terminology

Here's a quick list of basic terms you'll run into when you're working with Excel spreadsheets:

Term | Explanation
Spreadsheet or Workbook | A Spreadsheet is the main file you are creating or working with.
Worksheet or Sheet | A Sheet is used to split different kinds of content within the same spreadsheet. A Spreadsheet can have one or more Sheets.
Column | A Column is a vertical line, and it's represented by an uppercase letter: A.
Row | A Row is a horizontal line, and it's represented by a number: 1.
Cell | A Cell is a combination of Column and Row, represented by both an uppercase letter and a number: A1.
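
To connect these terms to code, here is a small sketch that creates an empty in-memory workbook and inspects one cell. The cell B3 is an arbitrary choice for illustration, and the helper functions come from openpyxl.utils:

```python
from openpyxl import Workbook
from openpyxl.utils import column_index_from_string, get_column_letter

workbook = Workbook()      # the Spreadsheet (Workbook)
sheet = workbook.active    # one Sheet (Worksheet) inside it
cell = sheet["B3"]         # the Cell where Column B meets Row 3

print(cell.coordinate)                 # B3
print(cell.column_letter, cell.row)    # B 3

# Columns are letters in Excel but numbers in much of the API,
# so converting between the two forms is a common need.
print(column_index_from_string("B"))   # 2
print(get_column_letter(2))            # B
```

Columns and rows are both counted from 1, so column A is column number 1.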

Getting Started With openpyxl

Now that you're aware of the benefits of a tool like openpyxl, let's get down to it and start by installing the package. For this tutorial, you should use Python 3.7 and openpyxl 2.6.2. To install the package, you can do the following:

    $ pip install openpyxl

After you install the package, you should be able to create a super simple spreadsheet with the following code:

    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active

    sheet["A1"] = "hello"
    sheet["B1"] = "world!"

    workbook.save(filename="hello_world.xlsx")

The code above should create a file called hello_world.xlsx in the folder you are using to run the code. If you open that file with Excel you should see something like this:

A Simple Hello World Spreadsheet

Woohoo, your first spreadsheet created!

Reading Excel Spreadsheets With openpyxl

Let's start with the most essential thing one can do with a spreadsheet: read it.

You'll go from a straightforward approach to reading a spreadsheet to more complex examples where you read the information and convert it into more useful Python structures.

Dataset for This Tutorial

Before you dive deep into some code examples, you should download this sample dataset and store it somewhere as sample.xlsx:

This is one of the datasets you'll be using throughout this tutorial, and it's a spreadsheet with a sample of real data from Amazon's online product reviews. This dataset is just a tiny fraction of what Amazon provides, but for testing purposes, it's more than enough.

A Simple Approach to Reading an Excel Spreadsheet

Finally, let's start reading some spreadsheets! To begin with, open our sample spreadsheet:

    >>> from openpyxl import load_workbook
    >>> workbook = load_workbook(filename="sample.xlsx")
    >>> workbook.sheetnames
    ['Sheet 1']

    >>> sheet = workbook.active
    >>> sheet
    <Worksheet "Sheet 1">

    >>> sheet.title
    'Sheet 1'

In the code above, you first open the spreadsheet sample.xlsx using load_workbook(), and then you can use workbook.sheetnames to see all the sheets you have available to work with. After that, workbook.active selects the first available sheet and, in this case, you can see that it selects Sheet 1 automatically. Using these methods is the default way of opening a spreadsheet, and you'll see it many times during this tutorial.

Now, after opening a spreadsheet, you can easily retrieve data from it like this:

    >>> sheet["A1"]
    <Cell 'Sheet 1'.A1>

    >>> sheet["A1"].value
    'marketplace'

    >>> sheet["F10"].value
    "G-Shock Men's Grey Sport Watch"

To return the actual value of a cell, you need to do .value. Otherwise, you'll get the main Cell object. You can also use the method .cell() to retrieve a cell using index notation. Remember to add .value to get the actual value and not a Cell object:

    >>> sheet.cell(row=10, column=6)
    <Cell 'Sheet 1'.F10>

    >>> sheet.cell(row=10, column=6).value
    "G-Shock Men's Grey Sport Watch"

You can see that the results returned are the same, no matter which way you decide to go with. However, in this tutorial, you'll be mostly using the first approach: ["A1"].
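
If you don't have sample.xlsx at hand, the equivalence of the two access styles can be checked on a throwaway in-memory workbook. This is only a sketch: the workbook below is created on the spot and holds a single value copied from the example output above.

```python
from openpyxl import Workbook

# Stand-in workbook with one cell filled in, instead of sample.xlsx
workbook = Workbook()
sheet = workbook.active
sheet["F10"] = "G-Shock Men's Grey Sport Watch"

by_name = sheet["F10"]                    # spreadsheet-style coordinate
by_index = sheet.cell(row=10, column=6)   # 1-based row and column numbers

print(by_index.coordinate)                # F10
print(by_name.value == by_index.value)    # True
```

The index form tends to be handier inside loops, where the row and column numbers are computed rather than typed.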

The above shows you the quickest way to open a spreadsheet. However, you can pass additional parameters to change the way a spreadsheet is loaded.

Additional Reading Options

There are a few arguments you can pass to load_workbook() that change the way a spreadsheet is loaded. The most important ones are the following two Booleans:

  1. read_only loads a spreadsheet in read-only mode, allowing you to open very large Excel files.
  2. data_only ignores loading formulas and instead loads only the resulting values.
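
The effect of both flags can be sketched without any file on disk by saving a small workbook to an in-memory buffer and loading it back in different modes. The cell contents and variable names below are made up for illustration. One caveat worth knowing: openpyxl does not calculate formulas itself, so a file that has only ever been written by openpyxl has no stored results for data_only to return.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook

# Build a tiny workbook with one formula and keep it in memory
workbook = Workbook()
sheet = workbook.active
sheet["A1"] = 2
sheet["A2"] = 3
sheet["A3"] = "=SUM(A1:A2)"

buffer = BytesIO()
workbook.save(buffer)
data = buffer.getvalue()

# Default loading keeps the formula text
with_formulas = load_workbook(BytesIO(data))
formula = with_formulas.active["A3"].value
print(formula)   # =SUM(A1:A2)

# data_only=True asks for the stored result instead of the formula.
# Excel has never opened this file, so no result was stored.
results_only = load_workbook(BytesIO(data), data_only=True)
cached = results_only.active["A3"].value
print(cached)    # None

# read_only=True reads rows lazily, which suits very large files
streamed = load_workbook(BytesIO(data), read_only=True)
first_column = [row[0] for row in streamed.active.iter_rows(values_only=True)]
streamed.close()
print(first_column[:2])   # [2, 3]
```

With a spreadsheet that was last saved by Excel, data_only=True would return the calculated number for A3 instead.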

Importing Data From a Spreadsheet

Now that you've learned the basics about loading a spreadsheet, it's about time you get to the fun part: the iteration and actual usage of the values within the spreadsheet.

This section is where you'll learn all the different ways you can iterate through the data, but also how to convert that data into something usable and, more importantly, how to do it in a Pythonic way.

Iterating Through the Data

There are a few different ways you can iterate through the data depending on your needs.

You can slice the data with a combination of columns and rows:

    >>> sheet["A1:C2"]
    ((<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>),
     (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>))

You can get ranges of rows or columns:

    >>> # Get all cells from column A
    >>> sheet["A"]
    (<Cell 'Sheet 1'.A1>,
     <Cell 'Sheet 1'.A2>,
     ...
     <Cell 'Sheet 1'.A99>,
     <Cell 'Sheet 1'.A100>)

    >>> # Get all cells for a range of columns
    >>> sheet["A:B"]
    ((<Cell 'Sheet 1'.A1>,
      <Cell 'Sheet 1'.A2>,
      ...
      <Cell 'Sheet 1'.A99>,
      <Cell 'Sheet 1'.A100>),
     (<Cell 'Sheet 1'.B1>,
      <Cell 'Sheet 1'.B2>,
      ...
      <Cell 'Sheet 1'.B99>,
      <Cell 'Sheet 1'.B100>))

    >>> # Get all cells from row 5
    >>> sheet[5]
    (<Cell 'Sheet 1'.A5>,
     <Cell 'Sheet 1'.B5>,
     ...
     <Cell 'Sheet 1'.N5>,
     <Cell 'Sheet 1'.O5>)

    >>> # Get all cells for a range of rows
    >>> sheet[5:6]
    ((<Cell 'Sheet 1'.A5>,
      <Cell 'Sheet 1'.B5>,
      ...
      <Cell 'Sheet 1'.N5>,
      <Cell 'Sheet 1'.O5>),
     (<Cell 'Sheet 1'.A6>,
      <Cell 'Sheet 1'.B6>,
      ...
      <Cell 'Sheet 1'.N6>,
      <Cell 'Sheet 1'.O6>))

You'll notice that all of the above examples return a tuple. If you want to refresh your memory on how to handle tuples in Python, check out the article on Lists and Tuples in Python.
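
Because slices come back as tuples of Cell objects, pulling out the plain values is a matter of a comprehension. The sketch below uses a two-row stand-in sheet built in memory (the values mirror the first columns of the sample data) rather than sample.xlsx itself:

```python
from openpyxl import Workbook

# Two-row stand-in for the first three columns of the sample data
workbook = Workbook()
sheet = workbook.active
sheet.append(["marketplace", "customer_id", "review_id"])
sheet.append(["US", 3653882, "R3O9SGZBVQBV76"])

block = sheet["A1:C2"]    # a tuple of rows, each row a tuple of cells
values = [[cell.value for cell in row] for row in block]
print(values)
# [['marketplace', 'customer_id', 'review_id'], ['US', 3653882, 'R3O9SGZBVQBV76']]

column_a = [cell.value for cell in sheet["A"]]   # a flat tuple of cells
print(column_a)
# ['marketplace', 'US']
```

Note the difference in shape: a rectangular range is nested one level deeper than a single row or column.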

There are also multiple ways of using normal Python generators to go through the data. The main methods you can use to achieve this are:

  • .iter_rows()
  • .iter_cols()

Both methods can receive the following arguments:

  • min_row
  • max_row
  • min_col
  • max_col

These arguments are used to set boundaries for the iteration:

    >>> for row in sheet.iter_rows(min_row=1,
    ...                            max_row=2,
    ...                            min_col=1,
    ...                            max_col=3):
    ...     print(row)
    (<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>)
    (<Cell 'Sheet 1'.A2>, <Cell 'Sheet 1'.B2>, <Cell 'Sheet 1'.C2>)

    >>> for column in sheet.iter_cols(min_row=1,
    ...                               max_row=2,
    ...                               min_col=1,
    ...                               max_col=3):
    ...     print(column)
    (<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.A2>)
    (<Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.B2>)
    (<Cell 'Sheet 1'.C1>, <Cell 'Sheet 1'.C2>)

You'll notice that in the first example, when iterating through the rows using .iter_rows(), you get one tuple element per row selected. While when using .iter_cols() and iterating through columns, you'll get one tuple per column instead.

One additional argument you can pass to both methods is the Boolean values_only. When it's set to True, the values of the cell are returned, instead of the Cell object:

    >>> for value in sheet.iter_rows(min_row=1,
    ...                              max_row=2,
    ...                              min_col=1,
    ...                              max_col=3,
    ...                              values_only=True):
    ...     print(value)
    ('marketplace', 'customer_id', 'review_id')
    ('US', 3653882, 'R3O9SGZBVQBV76')

If you want to iterate through the whole dataset, then you can also use the attributes .rows or .columns directly, which are shortcuts to using .iter_rows() and .iter_cols() without any arguments:

    >>> for row in sheet.rows:
    ...     print(row)
    (<Cell 'Sheet 1'.A1>, <Cell 'Sheet 1'.B1>, <Cell 'Sheet 1'.C1>
    ...
    <Cell 'Sheet 1'.M100>, <Cell 'Sheet 1'.N100>, <Cell 'Sheet 1'.O100>)

These shortcuts are very useful when you're iterating through the whole dataset.
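
As a quick check of that equivalence, the following sketch compares the shortcuts with the explicit method calls on a small stand-in sheet. The sheet contents are illustrative only:

```python
from openpyxl import Workbook

# Small 2x3 stand-in sheet
workbook = Workbook()
sheet = workbook.active
sheet.append(["marketplace", "customer_id", "review_id"])
sheet.append(["US", 3653882, "R3O9SGZBVQBV76"])

rows_shortcut = [[cell.coordinate for cell in row] for row in sheet.rows]
rows_explicit = [[cell.coordinate for cell in row] for row in sheet.iter_rows()]
print(rows_shortcut == rows_explicit)   # True
print(rows_shortcut)
# [['A1', 'B1', 'C1'], ['A2', 'B2', 'C2']]

cols_shortcut = [[cell.coordinate for cell in col] for col in sheet.columns]
print(cols_shortcut)
# [['A1', 'A2'], ['B1', 'B2'], ['C1', 'C2']]
```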

Manipulate Data Using Python's Default Data Structures

Now that you know the basics of iterating through the data in a workbook, let's look at smart ways of converting that data into Python structures.

As you saw earlier, the result from all iterations comes in the form of tuples. However, since a tuple is nothing more than an immutable list, you can easily access its data and transform it into other structures.

For example, say you want to extract product information from the sample.xlsx spreadsheet and into a dictionary where each key is a product ID.

A straightforward way to do this is to iterate over all the rows, pick the columns you know are related to product information, and then store that in a dictionary. Let's code this out!

First of all, have a look at the headers and see what information you care most about:

    >>> for value in sheet.iter_rows(min_row=1,
    ...                              max_row=1,
    ...                              values_only=True):
    ...     print(value)
    ('marketplace', 'customer_id', 'review_id', 'product_id', ...)

This code prints a tuple with all the column names you have in the spreadsheet. To start, grab the columns with these names:

  • product_id
  • product_parent
  • product_title
  • product_category

Lucky for you, the columns you need are all next to each other so you can use min_col and max_col to easily get the data you want:

    >>> for value in sheet.iter_rows(min_row=2,
    ...                              min_col=4,
    ...                              max_col=7,
    ...                              values_only=True):
    ...     print(value)
    ('B00FALQ1ZC', 937001370, 'Invicta Women\'s 15150 "Angel" 18k Yellow...)
    ('B00D3RGO20', 484010722, "Kenneth Cole New York Women's KC4944...)
    ...

Nice! Now that you know how to get all the important product information you need, let's put that data into a dictionary:

    import json
    from openpyxl import load_workbook

    workbook = load_workbook(filename="sample.xlsx")
    sheet = workbook.active

    products = {}

    # Using the values_only because you want to return the cells' values
    for row in sheet.iter_rows(min_row=2,
                               min_col=4,
                               max_col=7,
                               values_only=True):
        product_id = row[0]
        product = {
            "parent": row[1],
            "title": row[2],
            "category": row[3]
        }
        products[product_id] = product

    # Using json here to be able to format the output for displaying later
    print(json.dumps(products))

The code above returns a JSON similar to this:

    {
        "B00FALQ1ZC": {
            "parent": 937001370,
            "title": "Invicta Women's 15150 ...",
            "category": "Watches"
        },
        "B00D3RGO20": {
            "parent": 484010722,
            "title": "Kenneth Cole New York ...",
            "category": "Watches"
        }
    }

Here you can see that the output is trimmed to 2 products only, but if you run the script as it is, then you should get 98 products.
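
A possible variation, shown here only as a sketch, is to let the header row supply the dictionary keys instead of indexing each row by position. The sheet below is an in-memory stand-in that contains just the four product columns and the two trimmed rows shown above, so it is not the real sample.xlsx:

```python
from openpyxl import Workbook

# Stand-in sheet holding only the four product columns
workbook = Workbook()
sheet = workbook.active
sheet.append(["product_id", "product_parent", "product_title", "product_category"])
sheet.append(["B00FALQ1ZC", 937001370, "Invicta Women's 15150 ...", "Watches"])
sheet.append(["B00D3RGO20", 484010722, "Kenneth Cole New York ...", "Watches"])

rows = sheet.iter_rows(values_only=True)
header = next(rows)   # the first row holds the column names

products = {}
for row in rows:
    record = dict(zip(header, row))   # pair each column name with its value
    products[record["product_id"]] = record

print(products["B00FALQ1ZC"]["product_category"])   # Watches
```

The trade-off is that the keys now follow the spreadsheet's column names rather than names you pick yourself.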

Convert Data Into Python Classes

To finalize the reading section of this tutorial, let's dive into Python classes and see how you could improve on the example above and better structure the data.

For this, you'll be using the new Python Data Classes that are available from Python 3.7. If you're using an older version of Python, then you can use the default Classes instead.

So, first things first, let's look at the data you have and decide what you want to store and how you want to store it.

As you saw right at the start, this data comes from Amazon, and it's a list of product reviews. You can check the list of all the columns and their meaning on Amazon.

There are two significant elements you can extract from the data available:

  1. Products
  2. Reviews

A Product has:

  • ID
  • Title
  • Parent
  • Category

The Review has a few more fields:

  • ID
  • Customer ID
  • Stars
  • Headline
  • Body
  • Date

You can ignore a few of the review fields to make things a bit simpler.

So, a straightforward implementation of these two classes could be written in a separate file classes.py:

    import datetime
    from dataclasses import dataclass

    @dataclass
    class Product:
        id: str
        parent: str
        title: str
        category: str

    @dataclass
    class Review:
        id: str
        customer_id: str
        stars: int
        headline: str
        body: str
        date: datetime.datetime
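
If you're on a Python version without data classes, a hand-written class can stand in for them. The following is a rough sketch for Product only, not part of the tutorial's own files: it spells out the initializer, representation, and equality methods that a data class would otherwise generate, and the sample values are illustrative.

```python
class Product:
    """Hand-written stand-in for the Product data class (illustrative)."""

    def __init__(self, id, parent, title, category):
        self.id = id
        self.parent = parent
        self.title = title
        self.category = category

    def __repr__(self):
        return "Product(id={!r}, parent={!r}, title={!r}, category={!r})".format(
            self.id, self.parent, self.title, self.category
        )

    def __eq__(self, other):
        if not isinstance(other, Product):
            return NotImplemented
        return (self.id, self.parent, self.title, self.category) == (
            other.id, other.parent, other.title, other.category
        )


watch = Product("B00FALQ1ZC", "937001370", "Invicta Women's 15150 ...", "Watches")
print(watch)
```

A Review class would follow the same pattern with its six fields.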

After defining your data classes, you need to convert the data from the spreadsheet into these new structures.

Before doing the conversion, it's worth looking at our header again and creating a mapping between columns and the fields you need:

    >>> for value in sheet.iter_rows(min_row=1,
    ...                              max_row=1,
    ...                              values_only=True):
    ...     print(value)
    ('marketplace', 'customer_id', 'review_id', 'product_id', ...)

    >>> # Or an alternative
    >>> for cell in sheet[1]:
    ...     print(cell.value)
    marketplace
    customer_id
    review_id
    product_id
    product_parent
    ...

Let's create a file mapping.py where you have a list of all the field names and their column location (zero-indexed) on the spreadsheet:

    # Product fields
    PRODUCT_ID = 3
    PRODUCT_PARENT = 4
    PRODUCT_TITLE = 5
    PRODUCT_CATEGORY = 6

    # Review fields
    REVIEW_ID = 2
    REVIEW_CUSTOMER = 1
    REVIEW_STARS = 7
    REVIEW_HEADLINE = 12
    REVIEW_BODY = 13
    REVIEW_DATE = 14

You don't necessarily have to do the mapping above. It's more for readability when parsing the row data, so you don't end up with a lot of magic numbers lying around.
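
One alternative to typing the positions by hand, sketched here as an assumption rather than as the tutorial's approach, is to derive them from the header row. The tuple below contains only the first five column names visible in the output above, so only the fields among those five can be looked up in this sketch:

```python
# First five header names as printed earlier; the remaining ones are omitted
header = ("marketplace", "customer_id", "review_id", "product_id", "product_parent")

# Map every column name to its zero-indexed position in a row tuple
column_index = {name: position for position, name in enumerate(header)}

PRODUCT_ID = column_index["product_id"]
PRODUCT_PARENT = column_index["product_parent"]
REVIEW_ID = column_index["review_id"]
REVIEW_CUSTOMER = column_index["customer_id"]

print(PRODUCT_ID, PRODUCT_PARENT, REVIEW_ID, REVIEW_CUSTOMER)   # 3 4 2 1
```

These positions match the hand-written constants in mapping.py, and they would keep matching if columns were reordered, as long as the names stayed the same.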

Finally, let's look at the code needed to parse the spreadsheet data into a list of product and review objects:

    from datetime import datetime
    from openpyxl import load_workbook
    from classes import Product, Review
    from mapping import PRODUCT_ID, PRODUCT_PARENT, PRODUCT_TITLE, \
        PRODUCT_CATEGORY, REVIEW_DATE, REVIEW_ID, REVIEW_CUSTOMER, \
        REVIEW_STARS, REVIEW_HEADLINE, REVIEW_BODY

    # Using the read_only method since you're not gonna be editing the spreadsheet
    workbook = load_workbook(filename="sample.xlsx", read_only=True)
    sheet = workbook.active

    products = []
    reviews = []

    # Using the values_only because you just want to return the cell value
    for row in sheet.iter_rows(min_row=2, values_only=True):
        product = Product(id=row[PRODUCT_ID],
                          parent=row[PRODUCT_PARENT],
                          title=row[PRODUCT_TITLE],
                          category=row[PRODUCT_CATEGORY])
        products.append(product)

        # You need to parse the date from the spreadsheet into a datetime format
        spread_date = row[REVIEW_DATE]
        parsed_date = datetime.strptime(spread_date, "%Y-%m-%d")

        review = Review(id=row[REVIEW_ID],
                        customer_id=row[REVIEW_CUSTOMER],
                        stars=row[REVIEW_STARS],
                        headline=row[REVIEW_HEADLINE],
                        body=row[REVIEW_BODY],
                        date=parsed_date)
        reviews.append(review)

    print(products[0])
    print(reviews[0])

After you run the code above, you should get some output like this:

    Product(id='B00FALQ1ZC', parent=937001370, ...)
    Review(id='R3O9SGZBVQBV76', customer_id=3653882, ...)

That's it! Now you should have the data in a very simple and digestible class format, and you can start thinking of storing this in a database or any other type of data storage you like.

Using this kind of OOP strategy to parse spreadsheets makes handling the data much simpler later.
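One detail worth guarding against: depending on how the date column is formatted in the workbook, openpyxl may hand back a datetime object rather than a string, in which case strptime() would raise a TypeError. Here's a minimal sketch of a defensive helper, where parse_review_date is a hypothetical name and the default format is the one used in the parsing code above:

```python
from datetime import datetime


def parse_review_date(value, fmt="%Y-%m-%d"):
    """Return a datetime whether the cell held text or a real date."""
    if isinstance(value, datetime):
        # The cell was already date-formatted, so there is nothing to parse.
        return value
    return datetime.strptime(value, fmt)


from_text = parse_review_date("2015-08-31")
from_date_cell = parse_review_date(datetime(2015, 8, 31))
print(from_text == from_date_cell)
```

The sample date is made up for illustration. Other cell contents, such as empty cells, would still need their own handling.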

Appending New Data

Before you start creating very complex spreadsheets, have a quick look at an example of how to append data to an existing spreadsheet.

Go back to the first example spreadsheet you created (hello_world.xlsx) and try opening it and appending some data to it, like this:

    from openpyxl import load_workbook

    # Start by opening the spreadsheet and selecting the main sheet
    workbook = load_workbook(filename="hello_world.xlsx")
    sheet = workbook.active

    # Write what you want into a specific cell
    sheet["C1"] = "writing ;)"

    # Save the spreadsheet
    workbook.save(filename="hello_world_append.xlsx")

Et voilà, if you open the new hello_world_append.xlsx spreadsheet, you'll see the following change:

Appending Data to a Spreadsheet

Notice the additional writing ;) on cell C1.
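Writing to a named cell is one option. If the goal is instead to add a whole row below the existing data, openpyxl worksheets also provide an .append() method. The sketch below works on a fresh in-memory workbook rather than hello_world.xlsx, so it runs without any file on disk:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"
sheet["B1"] = "world!"

# .append() writes the values into the first row after the existing data.
sheet.append(["appended", "row"])

rows = list(sheet.iter_rows(values_only=True))
for row in rows:
    print(row)
```

With a workbook loaded from disk, you would still call workbook.save() afterward to persist the new row.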

Writing Excel Spreadsheets With openpyxl

There are a lot of different things you can write to a spreadsheet, from simple text or number values to complex formulas, charts, or even images.

Let's start creating some spreadsheets!

Creating a Simple Spreadsheet

Previously, you saw a very quick example of how to write "Hello world!" into a spreadsheet, so you can start with that:

     1  from openpyxl import Workbook
     2
     3  filename = "hello_world.xlsx"
     4
     5  workbook = Workbook()
     6  sheet = workbook.active
     7
     8  sheet["A1"] = "hello"
     9  sheet["B1"] = "world!"
    10
    11  workbook.save(filename=filename)

Lines 5, 8, 9, and 11 in the code above are the most important ones for writing. In the code, you can see that:

  • Line 5 shows you how to create a new empty workbook.
  • Lines 8 and 9 show you how to add data to specific cells.
  • Line 11 shows you how to save the spreadsheet when you're done.

Even though the lines above are straightforward, it's still good to know them well for when things get a bit more complicated.

One thing you can do to help with the coming code examples is add the following function to your Python file or console:

    >>> def print_rows():
    ...     for row in sheet.iter_rows(values_only=True):
    ...         print(row)

It makes it easier to print all of your spreadsheet values by just calling print_rows().

Basic Spreadsheet Operations

Before you get into the more advanced topics, it's good to know how to manage the simplest elements of a spreadsheet.

Adding and Updating Cell Values

You already learned how to add values to a spreadsheet like this:

    >>> sheet["A1"] = "value"

There's another way you can do this, by first selecting a cell and then changing its value:

    >>> cell = sheet["A1"]
    >>> cell
    <Cell 'Sheet'.A1>

    >>> cell.value
    'hello'

    >>> cell.value = "hey"
    >>> cell.value
    'hey'

The new value is only stored into the spreadsheet once you call workbook.save().

openpyxl creates a cell when you add a value to it, if that cell didn't exist before:

    >>> # Before, our spreadsheet has only 1 row
    >>> print_rows()
    ('hello', 'world!')

    >>> # Try adding a value to row 10
    >>> sheet["B10"] = "test"
    >>> print_rows()
    ('hello', 'world!')
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    (None, 'test')

As you can see, when trying to add a value to cell B10, you end up with 10 rows, each printed as a tuple, just so you can have that test value.
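If you'd rather address cells by number than by a string such as "B10", worksheets also have a .cell() method that takes 1-based row and column indexes. This minimal sketch rebuilds the hello world sheet in memory and then checks how far the sheet extends afterward:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
sheet["A1"] = "hello"
sheet["B1"] = "world!"

# Same effect as sheet["B10"] = "test", using 1-based row and column numbers.
sheet.cell(row=10, column=2, value="test")

# max_row and max_column report the extent of the used area.
print(sheet.max_row, sheet.max_column)
print(sheet["B10"].value)
```

Numeric indexes tend to be handier inside loops, where building "B10"-style strings by hand gets tedious.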

Managing Rows and Columns

One of the most common things you have to do when manipulating spreadsheets is adding or removing rows and columns. The openpyxl package allows you to do that in a very straightforward way by using the methods:

  • .insert_rows()
  • .delete_rows()
  • .insert_cols()
  • .delete_cols()

Every unmarried one of those methods can receive two arguments:

  1. idx
  2. amount

Using our basic hello_world.xlsx example again, let's see how these methods work:

    >>> print_rows()
    ('hello', 'world!')

    >>> # Insert a column before the existing column 1 ("A")
    >>> sheet.insert_cols(idx=1)
    >>> print_rows()
    (None, 'hello', 'world!')

    >>> # Insert 5 columns between column 2 ("B") and 3 ("C")
    >>> sheet.insert_cols(idx=3, amount=5)
    >>> print_rows()
    (None, 'hello', None, None, None, None, None, 'world!')

    >>> # Delete the created columns
    >>> sheet.delete_cols(idx=3, amount=5)
    >>> sheet.delete_cols(idx=1)
    >>> print_rows()
    ('hello', 'world!')

    >>> # Insert a new row in the beginning
    >>> sheet.insert_rows(idx=1)
    >>> print_rows()
    (None, None)
    ('hello', 'world!')

    >>> # Insert 3 new rows in the beginning
    >>> sheet.insert_rows(idx=1, amount=3)
    >>> print_rows()
    (None, None)
    (None, None)
    (None, None)
    (None, None)
    ('hello', 'world!')

    >>> # Delete the first 4 rows
    >>> sheet.delete_rows(idx=1, amount=4)
    >>> print_rows()
    ('hello', 'world!')

The only thing you need to remember is that when inserting new data (rows or columns), the insertion happens before the idx parameter.

So, if you do insert_rows(1), it inserts a new row before the existing first row.

It's the same for columns: when you call insert_cols(2), it inserts a new column right before the already existing second column (B).

However, when deleting rows or columns, .delete_... deletes data starting from the index passed as an argument.

For example, when doing delete_rows(2) it deletes row 2, and when doing delete_cols(3) it deletes the third column (C).
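To see the "insert before, delete from" rule in one place, here's a small self-contained sketch. It uses a made-up single-column sheet holding the values a, b, and c instead of hello_world.xlsx:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active
for value in ("a", "b", "c"):
    sheet.append([value])


def first_column():
    """Return the values of column A from top to bottom."""
    return [row[0] for row in sheet.iter_rows(values_only=True)]


# Inserting at idx=2 puts an empty row before the old row 2 ("b").
sheet.insert_rows(idx=2)
after_insert = first_column()

# Deleting from idx=2 removes row 2 itself plus the row after it.
sheet.delete_rows(idx=2, amount=2)
after_delete = first_column()

print(after_insert)
print(after_delete)
```

After the insert, the empty row sits between a and b. The delete then removes both that empty row and b, leaving a and c.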

Managing Sheets

Sheet management is also one of those things you might need to know, even though it might be something that you don't use that often.

If you look back at the code examples from this tutorial, you'll notice a recurring piece of code: sheet = workbook.active.

This is the way to select the default sheet from a spreadsheet. However, if you're opening a spreadsheet with multiple sheets, then you can always select a specific one like this:

    >>> # Let's say you have two sheets: "Products" and "Company Sales"
    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> # You can select a sheet using its title
    >>> products_sheet = workbook["Products"]
    >>> sales_sheet = workbook["Company Sales"]

You can also change a sheet title very easily:

    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> products_sheet = workbook["Products"]
    >>> products_sheet.title = "New Products"

    >>> workbook.sheetnames
    ['New Products', 'Company Sales']

If you want to create or delete sheets, then you can also do that with .create_sheet() and .remove():

    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> operations_sheet = workbook.create_sheet("Operations")
    >>> workbook.sheetnames
    ['Products', 'Company Sales', 'Operations']

    >>> # You can also define the position to create the sheet at
    >>> hr_sheet = workbook.create_sheet("HR", 0)
    >>> workbook.sheetnames
    ['HR', 'Products', 'Company Sales', 'Operations']

    >>> # To remove them, just pass the sheet as an argument to the .remove()
    >>> workbook.remove(operations_sheet)
    >>> workbook.sheetnames
    ['HR', 'Products', 'Company Sales']

    >>> workbook.remove(hr_sheet)
    >>> workbook.sheetnames
    ['Products', 'Company Sales']

One other thing you can do is make duplicates of a sheet using copy_worksheet():

    >>> workbook.sheetnames
    ['Products', 'Company Sales']

    >>> products_sheet = workbook["Products"]
    >>> workbook.copy_worksheet(products_sheet)
    <Worksheet "Products Copy">

    >>> workbook.sheetnames
    ['Products', 'Company Sales', 'Products Copy']

If you open your spreadsheet after saving the above code, you'll notice that the sheet Products Copy is a duplicate of the sheet Products.
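When a script runs more than once, it can help to look a sheet up by title first and only create it when it's missing. The following is a minimal sketch of that idea on an in-memory workbook. The helper name get_or_create_sheet is hypothetical and not part of openpyxl:

```python
from openpyxl import Workbook

workbook = Workbook()
workbook.active.title = "Products"


def get_or_create_sheet(workbook, title):
    """Return the sheet called title, creating it if it is missing."""
    if title in workbook.sheetnames:
        return workbook[title]
    return workbook.create_sheet(title)


sales_sheet = get_or_create_sheet(workbook, "Company Sales")
# The second call finds the existing sheet instead of adding a duplicate.
same_sheet = get_or_create_sheet(workbook, "Company Sales")

print(workbook.sheetnames)
```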

Freezing Rows and Columns

Something that you might want to do when working with big spreadsheets is to freeze a few rows or columns, so they remain visible when you scroll right or down.

Freezing data allows you to keep an eye on important rows or columns, regardless of where you scroll in the spreadsheet.

Again, openpyxl also has a way to accomplish this by using the worksheet freeze_panes attribute. For this example, go back to our sample.xlsx spreadsheet and try doing the following:

    >>> workbook = load_workbook(filename="sample.xlsx")
    >>> sheet = workbook.active
    >>> sheet.freeze_panes = "C2"
    >>> workbook.save("sample_frozen.xlsx")

If you open the sample_frozen.xlsx spreadsheet in your favorite spreadsheet editor, you'll notice that row 1 and columns A and B are frozen and are always visible no matter where you navigate within the spreadsheet.

This feature is handy, for example, to keep headers within sight, so you always know what each column represents.

Here's how it looks in the editor:

Example Spreadsheet With Frozen Rows and Columns

Notice how you're at the end of the spreadsheet, and yet, you can see both row 1 and columns A and B.
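The rule behind the "C2" value is that every row above and every column to the left of the given cell gets frozen. Here's a short sketch on an empty in-memory workbook that tries a couple of values and then removes the freeze again:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active

# Freeze only the header row: everything above row 2 stays visible.
sheet.freeze_panes = "A2"
header_only = sheet.freeze_panes

# Freeze row 1 plus columns A and B, as in the sample.xlsx example.
sheet.freeze_panes = "C2"
row_and_columns = sheet.freeze_panes

# Assigning None removes the frozen panes again.
sheet.freeze_panes = None

print(header_only, row_and_columns, sheet.freeze_panes)
```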

Adding Filters

You can use openpyxl to add filters and sorts to your spreadsheet. However, when you open the spreadsheet, the data won't be rearranged according to these sorts and filters.

At first, this might seem like a pretty useless feature, but when you're programmatically creating a spreadsheet that is going to be sent and used by somebody else, it's still nice to at least create the filters and allow people to use them afterward.

The code below is an example of how you would add some filters to our existing sample.xlsx spreadsheet:

    >>> # Check the used spreadsheet space using the attribute "dimensions"
    >>> sheet.dimensions
    'A1:O100'

    >>> sheet.auto_filter.ref = "A1:O100"
    >>> workbook.save(filename="sample_with_filters.xlsx")

You should now see the filters created when opening the spreadsheet in your editor:

Example Spreadsheet With Filters

You don't have to use sheet.dimensions if you know precisely which part of the spreadsheet you want to apply filters to.
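Conversely, when you don't know the range in advance, you can assign sheet.dimensions directly instead of copying the string by hand. The sketch below uses a tiny made-up dataset in an in-memory workbook, not sample.xlsx:

```python
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active

# Made-up rows, only there to give the sheet a used range.
sheet.append(["product", "stars"])
sheet.append(["watch", 5])
sheet.append(["strap", 3])

# Cover whatever range is currently in use.
sheet.auto_filter.ref = sheet.dimensions
print(sheet.auto_filter.ref)
```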

Adding Formulas

Formulas (or formulae) are one of the most powerful features of spreadsheets.

They give you the power to apply specific mathematical equations to a range of cells. Using formulas with openpyxl is as simple as editing the value of a cell.

You can see the list of formulas supported by openpyxl:

    >>> from openpyxl.utils import FORMULAE
    >>> FORMULAE
    frozenset({'ABS',
               'ACCRINT',
               'ACCRINTM',
               'ACOS',
               'ACOSH',
               'AMORDEGRC',
               'AMORLINC',
               'AND',
               ...
               'YEARFRAC',
               'YIELD',
               'YIELDDISC',
               'YIELDMAT',
               'ZTEST'})

Let's add some formulas to our sample.xlsx spreadsheet.

Starting with something easy, let's check the average star rating for the 99 reviews within the spreadsheet:

    >>> # Star rating is column "H"
    >>> sheet["P2"] = "=AVERAGE(H2:H100)"
    >>> workbook.save(filename="sample_formulas.xlsx")

If you open the spreadsheet now and go to cell P2, you should see that its value is: 4.18181818181818. Have a look in the editor:

Example Spreadsheet With Average Formula

You can use the same methodology to add any formulas to your spreadsheet. For example, let's count the number of reviews that had helpful votes:

    >>> # The helpful votes are counted on column "I"
    >>> sheet["P3"] = '=COUNTIF(I2:I100, ">0")'
    >>> workbook.save(filename="sample_formulas.xlsx")

You should get the number 21 on your P3 spreadsheet cell like so:

Example Spreadsheet With Average and CountIf Formula

You'll have to make sure that the strings within a formula are always in double quotes, so you either have to use single quotes around the formula like in the example above or you'll have to escape the double quotes inside the formula: "=COUNTIF(I2:I100, \">0\")".
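As a quick check of the quoting rule, the sketch below confirms that both spellings give the same Python string and looks the function name up in FORMULAE. It also shows that openpyxl keeps the formula as text in the cell, so the number only appears once a spreadsheet editor evaluates it. The workbook here is an empty in-memory one, not sample.xlsx:

```python
from openpyxl import Workbook
from openpyxl.utils import FORMULAE

workbook = Workbook()
sheet = workbook.active

single_quoted = '=COUNTIF(I2:I100, ">0")'
escaped = "=COUNTIF(I2:I100, \">0\")"

# Both spellings produce the same Python string.
print(single_quoted == escaped)

# The function name can be checked against openpyxl's list of known formulas.
print("COUNTIF" in FORMULAE)

# The cell holds the formula text itself, not a computed result.
sheet["P3"] = single_quoted
print(sheet["P3"].value)
```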

There are a ton of other formulas you can add to your spreadsheet using the same procedure you tried above. Give it a go yourself!

Adding Styles

Even though styling a spreadsheet might not be something y'all would exercise every twenty-four hour period, it'south nonetheless good to know how to do it.

Using openpyxl, y'all can apply multiple styling options to your spreadsheet, including fonts, borders, colors, and so on. Take a await at the openpyxl documentation to learn more than.

You tin also choose to either apply a way direct to a jail cell or create a template and reuse information technology to utilize styles to multiple cells.

Allow's start by having a expect at simple jail cell styling, using our sample.xlsx again as the base spreadsheet:

>>> # Import necessary style classes
>>> from openpyxl.styles import Font, Color, Alignment, Border, Side
>>> # Create a few styles
>>> bold_font = Font(bold=True)
>>> big_red_text = Font(color="00FF0000", size=20)
>>> center_aligned_text = Alignment(horizontal="center")
>>> double_border_side = Side(border_style="double")
>>> square_border = Border(top=double_border_side,
...                        right=double_border_side,
...                        bottom=double_border_side,
...                        left=double_border_side)
>>> # Style some cells!
>>> sheet["A2"].font = bold_font
>>> sheet["A3"].font = big_red_text
>>> sheet["A4"].alignment = center_aligned_text
>>> sheet["A5"].border = square_border
>>> workbook.save(filename="sample_styles.xlsx")

If you open your spreadsheet now, you should see quite a few different styles in the first five cells of column A:

Example Spreadsheet With Simple Cell Styles

There you go. You got:

  • A2 with the text in bold
  • A3 with the text in red and bigger font size
  • A4 with the text centered
  • A5 with a square border around the text
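The styling steps above depend on the sample.xlsx workbook loaded earlier in the tutorial. As a minimal, self-contained sketch, the snippet below builds a throwaway in-memory workbook instead (the cell text is made up for illustration), applies the same four styles, and reads the attributes back to confirm what each cell carries:

```python
from openpyxl import Workbook
from openpyxl.styles import Font, Alignment, Border, Side

# Throwaway workbook standing in for sample.xlsx (an assumption of this sketch)
workbook = Workbook()
sheet = workbook.active
for row in range(1, 6):
    sheet.cell(row=row, column=1, value=f"Row {row}")

double_side = Side(border_style="double")

sheet["A2"].font = Font(bold=True)
sheet["A3"].font = Font(color="00FF0000", size=20)
sheet["A4"].alignment = Alignment(horizontal="center")
sheet["A5"].border = Border(top=double_side, right=double_side,
                            bottom=double_side, left=double_side)

# Reading the attributes back is a quick way to check a cell's styling
print(sheet["A2"].font.bold)
print(sheet["A3"].font.color.rgb, sheet["A3"].font.size)
print(sheet["A4"].alignment.horizontal)
print(sheet["A5"].border.top.style)
```

Reading styles back like this can be handy in automated checks, where opening the file in Excel is not an option.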

You can also combine styles by simply adding them to the cell at the same time:

>>> # Reusing the same styles from the example above
>>> sheet["A6"].alignment = center_aligned_text
>>> sheet["A6"].font = big_red_text
>>> sheet["A6"].border = square_border
>>> workbook.save(filename="sample_styles.xlsx")

Have a look at cell A6 here:

Example Spreadsheet With Coupled Cell Styles

When you want to apply multiple styles to one or several cells, you can use a NamedStyle class instead, which is like a style template that you can use over and over again. Have a look at the example below:

>>> from openpyxl.styles import NamedStyle
>>> # Let's create a style template for the header row
>>> header = NamedStyle(name="header")
>>> header.font = Font(bold=True)
>>> header.border = Border(bottom=Side(border_style="thin"))
>>> header.alignment = Alignment(horizontal="center", vertical="center")
>>> # Now let's apply this to all first row (header) cells
>>> header_row = sheet[1]
>>> for cell in header_row:
...     cell.style = header
...
>>> workbook.save(filename="sample_styles.xlsx")

If you open the spreadsheet now, you should see that its first row is bold, the text is aligned to the center, and there's a small bottom border! Have a look below:

Example Spreadsheet With Named Styles

As you saw above, there are many options when it comes to styling, and the right choice depends on the use case, so feel free to check the openpyxl documentation and see what other things you can do.
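Because a NamedStyle acts as a template, it can also be registered with the workbook once and then applied by its name. The sketch below assumes a fresh in-memory workbook with a made-up header row rather than sample.xlsx, and uses `add_named_style()` for the registration:

```python
from openpyxl import Workbook
from openpyxl.styles import NamedStyle, Font, Border, Side, Alignment

# Fresh workbook with a made-up header row (an assumption of this sketch)
workbook = Workbook()
sheet = workbook.active
sheet.append(["Product", "Online", "Shop"])

header = NamedStyle(name="header")
header.font = Font(bold=True)
header.border = Border(bottom=Side(border_style="thin"))
header.alignment = Alignment(horizontal="center", vertical="center")

# Register the template once with the workbook...
workbook.add_named_style(header)

# ...then refer to it by name wherever it is needed
for cell in sheet[1]:
    cell.style = "header"

print(sheet["A1"].style)
print("header" in workbook.named_styles)
```

Applying the style by name keeps the styling code short when the same template is used across several sheets of one workbook.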

Conditional Formatting

This feature is one of my personal favorites when it comes to adding styles to a spreadsheet.

It's a much more powerful approach to styling because it dynamically applies styles according to how the data in the spreadsheet changes.

In a nutshell, conditional formatting allows you to specify a list of styles to apply to a cell (or cell range) according to specific conditions.

For example, a widespread use case is to have a balance sheet where all the negative totals are in red, and the positive ones are in green. This formatting makes it much more efficient to spot good vs bad periods.

Without further ado, let's pick our favorite spreadsheet, sample.xlsx, and add some conditional formatting.

You can start by adding a simple one that adds a red background to all reviews with less than 3 stars:

>>> from openpyxl.styles import PatternFill
>>> from openpyxl.styles.differential import DifferentialStyle
>>> from openpyxl.formatting.rule import Rule
>>> red_background = PatternFill(fgColor="00FF0000")
>>> diff_style = DifferentialStyle(fill=red_background)
>>> rule = Rule(type="expression", dxf=diff_style)
>>> rule.formula = ["$H1<3"]
>>> sheet.conditional_formatting.add("A1:O100", rule)
>>> workbook.save("sample_conditional_formatting.xlsx")

Now you'll see all the reviews with a star rating below 3 marked with a red background:

Example Spreadsheet With Simple Conditional Formatting

Code-wise, the only things that are new here are the objects DifferentialStyle and Rule:

  • DifferentialStyle is quite similar to NamedStyle, which you already saw above, and it's used to aggregate multiple styles such as fonts, borders, alignment, and so forth.
  • Rule is responsible for selecting the cells and applying the styles if the cells match the rule's logic.

Using a Rule object, you can create numerous conditional formatting scenarios.
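As one such scenario, the balance sheet case mentioned earlier (negative totals in red, positive ones in green) could be sketched with `CellIsRule`, a helper from `openpyxl.formatting.rule` that builds a Rule comparing each cell's value against a formula. The workbook contents, the cell range, and the output filename below are made up for illustration:

```python
from openpyxl import Workbook
from openpyxl.styles import Font
from openpyxl.formatting.rule import CellIsRule

# Made-up monthly totals (an assumption of this sketch)
workbook = Workbook()
sheet = workbook.active
sheet.append(["Month", "Total"])
for month, total in [("Jan", 120), ("Feb", -45), ("Mar", 80), ("Apr", -10)]:
    sheet.append([month, total])

# Red font for totals below zero, green font for totals above zero
negative_rule = CellIsRule(operator="lessThan", formula=["0"],
                           font=Font(color="00FF0000"))
positive_rule = CellIsRule(operator="greaterThan", formula=["0"],
                           font=Font(color="00008000"))

# Both rules target the same range of totals
sheet.conditional_formatting.add("B2:B5", negative_rule)
sheet.conditional_formatting.add("B2:B5", positive_rule)

workbook.save("balance_conditional_formatting.xlsx")
```

Since the rules are stored in the file and evaluated by the spreadsheet application, editing a total in Excel should update its color without rerunning the script.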

However, for simplicity's sake, the openpyxl package offers 3 built-in formats that make it easier to create a few common conditional formatting patterns. These built-ins are:

  • ColorScale
  • IconSet
  • DataBar

The ColorScale gives you the ability to create color gradients:

>>> from openpyxl.formatting.rule import ColorScaleRule
>>> color_scale_rule = ColorScaleRule(start_type="min",
...                                   start_color="00FF0000",  # Red
...                                   end_type="max",
...                                   end_color="0000FF00")  # Green
>>> # Again, let's add this gradient to the star ratings, column "H"
>>> sheet.conditional_formatting.add("H2:H100", color_scale_rule)
>>> workbook.save(filename="sample_conditional_formatting_color_scale.xlsx")

Now you should see a color gradient on column H, from red to green, according to the star rating:

Example Spreadsheet With Color Scale Conditional Formatting

You can also add a third color and make two gradients instead:

>>> from openpyxl.formatting.rule import ColorScaleRule
>>> color_scale_rule = ColorScaleRule(start_type="num",
...                                   start_value=1,
...                                   start_color="00FF0000",  # Red
...                                   mid_type="num",
...                                   mid_value=3,
...                                   mid_color="00FFFF00",  # Yellow
...                                   end_type="num",
...                                   end_value=5,
...                                   end_color="0000FF00")  # Green
>>> # Again, let's add this gradient to the star ratings, column "H"
>>> sheet.conditional_formatting.add("H2:H100", color_scale_rule)
>>> workbook.save(filename="sample_conditional_formatting_color_scale_3.xlsx")

This time, you'll notice that star ratings between 1 and 3 have a gradient from red to yellow, and star ratings between 3 and 5 have a gradient from yellow to green:

Example Spreadsheet With 2 Color Scales Conditional Formatting

The IconSet allows you to add an icon to the cell according to its value:

>>> from openpyxl.formatting.rule import IconSetRule
>>> icon_set_rule = IconSetRule("5Arrows", "num", [1, 2, 3, 4, 5])
>>> sheet.conditional_formatting.add("H2:H100", icon_set_rule)
>>> workbook.save("sample_conditional_formatting_icon_set.xlsx")

You'll see a colored arrow next to the star rating. This arrow is red and points down when the value of the cell is 1 and, as the rating gets better, the arrow starts pointing up and becomes green:

Example Spreadsheet With Icon Set Conditional Formatting

The openpyxl package has a full list of other icons you can use, besides the arrow.
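As one example of another icon style, the sketch below uses "3TrafficLights1" with percentage thresholds instead of fixed numbers. The ratings, the thresholds, and the output filename are made up for illustration, so treat this as a starting point rather than a recommended configuration:

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import IconSetRule

# Made-up ratings column (an assumption of this sketch)
workbook = Workbook()
sheet = workbook.active
sheet.append(["Rating"])
for rating in [1, 2, 3, 4, 5]:
    sheet.append([rating])

# Three icons need three thresholds; here they are percentages of the range
traffic_lights = IconSetRule("3TrafficLights1", "percent", [0, 33, 67])
sheet.conditional_formatting.add("A2:A6", traffic_lights)

workbook.save("icon_set_traffic_lights.xlsx")
```

The number of thresholds follows the number of icons in the set, which is why "5Arrows" above takes five values and this set takes three.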

Finally, the DataBar allows you to create progress bars:

>>> from openpyxl.formatting.rule import DataBarRule
>>> data_bar_rule = DataBarRule(start_type="num",
...                             start_value=1,
...                             end_type="num",
...                             end_value=5,
...                             color="0000FF00")  # Green
>>> sheet.conditional_formatting.add("H2:H100", data_bar_rule)
>>> workbook.save("sample_conditional_formatting_data_bar.xlsx")

You'll now see a green progress bar that gets fuller the closer the star rating is to the number 5:

Example Spreadsheet With Data Bar Conditional Formatting

As you can see, there are a lot of cool things you can do with conditional formatting.

Here, you saw only a few examples of what you can achieve with it, but check the openpyxl documentation to see a bunch of other options.

Adding Images

Even though images are not something that you'll often see in a spreadsheet, it's quite cool to be able to add them. Maybe you can use it for branding purposes or to make spreadsheets more personal.

To be able to load images to a spreadsheet using openpyxl, you'll have to install Pillow:

$ pip install Pillow

Apart from that, you'll also need an image. For this example, you can grab the Real Python logo below and convert it from .webp to .png using an online converter such as cloudconvert.com, save the final file as logo.png, and copy it to the root folder where you're running your examples:

Real Python Logo

Afterward, this is the code you need to import that image into the hello_world.xlsx spreadsheet:

from openpyxl import load_workbook
from openpyxl.drawing.image import Image

# Let's use the hello_world spreadsheet since it has less data
workbook = load_workbook(filename="hello_world.xlsx")
sheet = workbook.active

logo = Image("logo.png")

# A bit of resizing to not fill the whole spreadsheet with the logo
logo.height = 150
logo.width = 150

sheet.add_image(logo, "A3")
workbook.save(filename="hello_world_logo.xlsx")

You have an image on your spreadsheet! Here it is:

Example Spreadsheet With Image

The image's top left corner is on the cell you chose, in this case, A3.

Adding Pretty Charts

Another powerful thing you can do with spreadsheets is create an incredible variety of charts.

Charts are a great way to visualize and understand loads of data quickly. There are a lot of different chart types: bar chart, pie chart, line chart, and so on. openpyxl has support for a lot of them.

Here, you'll see only a couple of examples of charts because the theory behind it is the same for every single chart type:

For any chart you want to build, you'll need to define the chart type: BarChart, LineChart, and so forth, plus the data to be used for the chart, which is called Reference.

Before you can build your chart, you need to define what data you want to see represented in it. Sometimes, you can use the dataset as is, but other times you need to massage the data a bit to get additional information.

Let's start by building a new workbook with some sample data:

from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["Product", "Online", "Shop"],
    [1, 30, 45],
    [2, 40, 30],
    [3, 40, 25],
    [4, 50, 30],
    [5, 30, 25],
    [6, 25, 35],
    [7, 20, 40],
]

for row in rows:
    sheet.append(row)

Now you're going to start by creating a bar chart that displays the total number of sales per product:

chart = BarChart()
data = Reference(worksheet=sheet,
                 min_row=1,
                 max_row=8,
                 min_col=2,
                 max_col=3)

chart.add_data(data, titles_from_data=True)
sheet.add_chart(chart, "E2")

workbook.save("chart.xlsx")

There you have it. Below, you can see a very straightforward bar chart showing the difference between online product sales and in-shop product sales:

Example Spreadsheet With Bar Chart

Like with images, the top left corner of the chart is on the cell you added the chart to. In your case, it was on cell E2.
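The bar chart above only receives the two value columns, so the product numbers in column A are not passed to it. A second Reference can supply them as category labels through `set_categories()`. The sketch below is a shortened variant of the example: the trimmed dataset, the chart title, and the output filename are assumptions for illustration:

```python
from openpyxl import Workbook
from openpyxl.chart import BarChart, Reference

workbook = Workbook()
sheet = workbook.active
rows = [
    ["Product", "Online", "Shop"],
    [1, 30, 45],
    [2, 40, 30],
    [3, 40, 25],
]
for row in rows:
    sheet.append(row)

chart = BarChart()
chart.title = "Sales per product"  # Hypothetical title

# Values: columns B and C, including the header row used for series titles
data = Reference(worksheet=sheet, min_row=1, max_row=4, min_col=2, max_col=3)
chart.add_data(data, titles_from_data=True)

# Categories: the product numbers in column A, skipping the header
categories = Reference(worksheet=sheet, min_row=2, max_row=4,
                       min_col=1, max_col=1)
chart.set_categories(categories)

sheet.add_chart(chart, "E2")
workbook.save("chart_with_categories.xlsx")
```

Note that the values Reference starts at row 1 so that the header cells become the series titles, while the categories Reference starts at row 2 to leave the "Product" header out.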

Try creating a line chart instead, changing the data a bit:

import random
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Let's create some sample sales data
rows = [
    ["", "January", "February", "March", "April",
     "May", "June", "July", "August", "September",
     "October", "November", "December"],
    [1, ],
    [2, ],
    [3, ],
]

for row in rows:
    sheet.append(row)

for row in sheet.iter_rows(min_row=2,
                           max_row=4,
                           min_col=2,
                           max_col=13):
    for cell in row:
        cell.value = random.randrange(5, 100)

With the above code, you'll be able to generate some random data regarding the sales of 3 different products across a whole year.

Once that's done, you can very easily create a line chart with the following code:

chart = LineChart()
data = Reference(worksheet=sheet,
                 min_row=2,
                 max_row=4,
                 min_col=1,
                 max_col=13)

chart.add_data(data, from_rows=True, titles_from_data=True)
sheet.add_chart(chart, "C6")

workbook.save("line_chart.xlsx")

Here's the outcome of the above piece of code:

Example Spreadsheet With Line Chart

One thing to keep in mind here is the fact that you're using from_rows=True when adding the data. This argument makes the chart plot row by row instead of column by column.

In your sample data, you see that each product has a row with 12 values (1 column per month). That's why you use from_rows. If you don't pass that argument, by default, the chart tries to plot by column, and you'll get a month-by-month comparison of sales.

Another difference that has to do with the above argument change is the fact that our Reference now starts from the first column, min_col=1, instead of the second one. This change is needed because the chart now expects the first column to have the titles.
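To make the effect of from_rows concrete, the sketch below feeds the same Reference to two line charts and counts the series each one ends up with. It uses deterministic made-up numbers in place of the random data so that the result is reproducible:

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
sheet.append([""] + months)
for product in (1, 2, 3):
    # Deterministic made-up sales figures instead of random ones
    sheet.append([product] + [product * 10 + month for month in range(12)])

data = Reference(worksheet=sheet, min_row=2, max_row=4, min_col=1, max_col=13)

# Row by row: each product row becomes one series
by_row = LineChart()
by_row.add_data(data, from_rows=True, titles_from_data=True)

# Default, column by column: each column in the reference becomes one series
by_column = LineChart()
by_column.add_data(data, titles_from_data=True)

print(len(by_row.series))
print(len(by_column.series))
```

With from_rows=True there is one line per product. Without it, every column of the reference, including the first one, is treated as its own series, which is rarely what you want for this layout.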

There are a couple of other things you can also change regarding the style of the chart. For example, you can add specific categories to the chart:

cats = Reference(worksheet=sheet,
                 min_row=1,
                 max_row=1,
                 min_col=2,
                 max_col=13)
chart.set_categories(cats)

Add this piece of code before saving the workbook, and you should see the month names appearing instead of numbers:

Example Spreadsheet With Line Chart and Categories

Code-wise, this is a minimal change. But in terms of the readability of the spreadsheet, this makes it much easier for someone to open the spreadsheet and understand the chart right away.
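If you want to check that the categories were picked up, a small sketch like the following can help. It assumes that set_categories() attaches the labels to the series already present on the chart, which is why it is called after add_data() here, just like in the tutorial's code. The sample data is made up.

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Made-up sample: a header row with month names, then two product rows
sheet.append(["", "January", "February", "March"])
sheet.append([1, 10, 20, 30])
sheet.append([2, 15, 25, 35])

chart = LineChart()
data = Reference(worksheet=sheet, min_row=2, max_row=3, min_col=1, max_col=4)
chart.add_data(data, from_rows=True, titles_from_data=True)

# The category labels live in the header row, skipping the empty first cell
cats = Reference(worksheet=sheet, min_row=1, max_row=1, min_col=2, max_col=4)
chart.set_categories(cats)

# Collect the series that now carry a category reference
series_with_categories = [s for s in chart.series if s.cat is not None]
print(len(series_with_categories))
```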

Another thing you can do to improve the chart readability is to add axis titles. You can do it using the attributes x_axis and y_axis:

    chart.x_axis.title = "Months"
    chart.y_axis.title = "Sales (per unit)"

This will generate a spreadsheet like the one below:

Example Spreadsheet With Line Chart, Categories and Axis Titles

As you can see, small changes like the above make reading your chart a much easier and quicker task.

There is also a way to style your chart by using Excel's default ChartStyle property. In this case, you have to choose a number between 1 and 48. Depending on your choice, the colors of your chart change as well:

    # You can play with this by choosing any number between 1 and 48
    chart.style = 24

With the style selected above, all lines have some shade of orange:

Example Spreadsheet With Line Chart, Categories, Axis Titles and Style

There is no clear documentation on what each style number looks like, but this spreadsheet has a few examples of the styles available.
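Since the style numbers are hard to look up, one option is to generate your own comparison sheet. The sketch below places several copies of the same chart on one worksheet, each with a different style. The selection in STYLES_TO_COMPARE, the sample data, and the 16-row spacing between charts are arbitrary choices for illustration.

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

# Hypothetical selection of style numbers to compare (valid range: 1 to 48)
STYLES_TO_COMPARE = [1, 12, 24, 36, 48]

workbook = Workbook()
sheet = workbook.active

# Made-up sample data: one product row per line in the chart
sheet.append(["", "January", "February", "March"])
sheet.append([1, 10, 20, 30])
sheet.append([2, 15, 25, 35])

data = Reference(worksheet=sheet, min_row=2, max_row=3, min_col=1, max_col=4)

charts = []
for position, style in enumerate(STYLES_TO_COMPARE):
    chart = LineChart()
    chart.title = f"Style {style}"
    chart.style = style
    chart.add_data(data, from_rows=True, titles_from_data=True)

    # Stack the charts vertically, leaving 16 rows for each one
    anchor = f"F{1 + position * 16}"
    sheet.add_chart(chart, anchor)
    charts.append(chart)

# Uncomment to inspect the result in Excel:
# workbook.save("chart_styles.xlsx")
```

Opening the saved file should then let you compare the styles side by side and pick the number you like.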

Here's the full code used to generate the line chart with categories, axis titles, and style:

    import random
    from openpyxl import Workbook
    from openpyxl.chart import LineChart, Reference

    workbook = Workbook()
    sheet = workbook.active

    # Let's create some sample sales data
    rows = [
        ["", "January", "February", "March", "April",
         "May", "June", "July", "August", "September",
         "October", "November", "December"],
        [1],
        [2],
        [3],
    ]

    for row in rows:
        sheet.append(row)

    for row in sheet.iter_rows(min_row=2,
                               max_row=4,
                               min_col=2,
                               max_col=13):
        for cell in row:
            cell.value = random.randrange(5, 100)

    # Create a LineChart and add the main data
    chart = LineChart()
    data = Reference(worksheet=sheet,
                     min_row=2,
                     max_row=4,
                     min_col=1,
                     max_col=13)
    chart.add_data(data, titles_from_data=True, from_rows=True)

    # Add categories to the chart
    cats = Reference(worksheet=sheet,
                     min_row=1,
                     max_row=1,
                     min_col=2,
                     max_col=13)
    chart.set_categories(cats)

    # Rename the X and Y Axis
    chart.x_axis.title = "Months"
    chart.y_axis.title = "Sales (per unit)"

    # Apply a specific Style
    chart.style = 24

    # Save!
    sheet.add_chart(chart, "C6")
    workbook.save("line_chart.xlsx")

There are a lot more chart types and customizations you can apply, so be sure to check out the package documentation on this if you need some specific formatting.

Convert Python Classes to Excel Spreadsheet

You already saw how to convert an Excel spreadsheet's data into Python classes, but now let's do the opposite.

Let's imagine you have a database and are using some Object-Relational Mapping (ORM) to map DB objects into Python classes. Now, you want to export those same objects into a spreadsheet.

Let's assume the following data classes to represent the data coming from your database regarding product sales:

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class Sale:
        quantity: int

    @dataclass
    class Product:
        id: str
        name: str
        sales: List[Sale]

Now, let's generate some random data, assuming the above classes are stored in a db_classes.py file:

    import random

    # Ignore these for now. You'll use them in a sec ;)
    from openpyxl import Workbook
    from openpyxl.chart import LineChart, Reference

    from db_classes import Product, Sale

    products = []

    # Let's create 5 products
    for idx in range(1, 6):
        sales = []

        # Create 5 months of sales
        for _ in range(5):
            sale = Sale(quantity=random.randrange(5, 100))
            sales.append(sale)

        product = Product(id=str(idx),
                          name="Product %s" % idx,
                          sales=sales)
        products.append(product)

By running this piece of code, you should get 5 products with 5 months of sales, with a random quantity of sales for each month.

Now, to convert this into a spreadsheet, you need to iterate over the data and append it to the spreadsheet:

    workbook = Workbook()
    sheet = workbook.active

    # Append column names first
    sheet.append(["Product ID", "Product Name", "Month 1",
                  "Month 2", "Month 3", "Month 4", "Month 5"])

    # Append the data
    for product in products:
        data = [product.id, product.name]
        for sale in product.sales:
            data.append(sale.quantity)
        sheet.append(data)

That's it. That should allow you to create a spreadsheet with some data coming from your database.
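If the number of months can vary, you might prefer to build the header and the rows with small helper functions instead of hard-coding five month columns. The following is a minimal sketch of that idea. The helpers build_header() and product_to_row() are hypothetical names introduced here, and the data classes are repeated only to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sale:
    quantity: int


@dataclass
class Product:
    id: str
    name: str
    sales: List[Sale]


def build_header(month_count: int) -> List[str]:
    """Return the column names for a given number of months."""
    months = [f"Month {number}" for number in range(1, month_count + 1)]
    return ["Product ID", "Product Name"] + months


def product_to_row(product: Product) -> list:
    """Flatten one product into a list that sheet.append() can take."""
    return [product.id, product.name] + [sale.quantity for sale in product.sales]


# Made-up sample product with three months of sales
product = Product(id="1", name="Product 1", sales=[Sale(10), Sale(20), Sale(30)])

header = build_header(len(product.sales))
row = product_to_row(product)
print(header)
print(row)
```

Both return values are plain lists, so you could pass them straight to sheet.append() in the loop shown above.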

However, why not use some of that cool knowledge you gained recently to add a chart as well to display that data more visually?

All right, then you could probably do something like this:

    chart = LineChart()
    data = Reference(worksheet=sheet,
                     min_row=2,
                     max_row=6,
                     min_col=2,
                     max_col=7)

    chart.add_data(data, titles_from_data=True, from_rows=True)
    sheet.add_chart(chart, "B8")

    cats = Reference(worksheet=sheet,
                     min_row=1,
                     max_row=1,
                     min_col=3,
                     max_col=7)
    chart.set_categories(cats)

    chart.x_axis.title = "Months"
    chart.y_axis.title = "Sales (per unit)"

    workbook.save(filename="oop_sample.xlsx")

Now we're talking! Here's a spreadsheet generated from database objects and with a chart and everything:

Example Spreadsheet With Conversion from Python Data Classes

That's a great way for you to wrap up your new knowledge of charts!
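The Reference bounds in the chart code are hard-coded for 5 products and 5 months. As a hedged alternative, you could derive them from the worksheet itself using sheet.max_row and sheet.max_column. The sketch below uses made-up data in the same column layout (ID, name, then monthly sales) and assumes the sheet contains nothing else.

```python
from openpyxl import Workbook
from openpyxl.chart import LineChart, Reference

workbook = Workbook()
sheet = workbook.active

# Made-up data in the same layout: ID, name, then monthly sales
sheet.append(["Product ID", "Product Name", "Month 1", "Month 2", "Month 3"])
sheet.append(["1", "Product 1", 10, 20, 30])
sheet.append(["2", "Product 2", 15, 25, 35])

last_row = sheet.max_row      # last row that contains data
last_col = sheet.max_column   # last column that contains data

chart = LineChart()

# Column 2 holds the product names, which become the series titles
data = Reference(worksheet=sheet, min_row=2, max_row=last_row,
                 min_col=2, max_col=last_col)
chart.add_data(data, titles_from_data=True, from_rows=True)

# The month labels start in column 3 of the header row
cats = Reference(worksheet=sheet, min_row=1, max_row=1,
                 min_col=3, max_col=last_col)
chart.set_categories(cats)

# Place the chart two rows below the data
sheet.add_chart(chart, f"B{last_row + 2}")
print(last_row, last_col, len(chart.series))
```

This way, adding more products or months to the sheet does not require touching the chart code.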

Bonus: Working With Pandas

Even though you can use Pandas to handle Excel files, there are a few things that you either can't accomplish with Pandas or that you'd be better off doing with openpyxl directly.

For instance, some of the advantages of using openpyxl are the ability to easily customize your spreadsheet with styles, conditional formatting, and such.

But guess what, you don't have to worry about picking. In fact, openpyxl has support for both converting data from a Pandas DataFrame into a workbook and the reverse, converting an openpyxl workbook into a Pandas DataFrame.

First things first, remember to install the pandas package (for example, with pip install pandas).

Then, let's create a sample DataFrame:

    import pandas as pd

    data = {
        "Product Name": ["Product 1", "Product 2"],
        "Sales Month 1": [10, 20],
        "Sales Month 2": [5, 35],
    }
    df = pd.DataFrame(data)

Now that you have some data, you can use dataframe_to_rows() to convert it from a DataFrame into a worksheet:

    from openpyxl import Workbook
    from openpyxl.utils.dataframe import dataframe_to_rows

    workbook = Workbook()
    sheet = workbook.active

    for row in dataframe_to_rows(df, index=False, header=True):
        sheet.append(row)

    workbook.save("pandas.xlsx")

You should see a spreadsheet that looks like this:

Example Spreadsheet With Data from Pandas Data Frame

If you want to add the DataFrame's index, you can change the argument to index=True, and it adds each row's index into your spreadsheet.
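To get a feel for what dataframe_to_rows() produces before anything is written to a worksheet, you can collect its output into a list and print it. This is a small sketch using the same sample DataFrame. The variable names rows_without_index and rows_with_index are just for illustration.

```python
import pandas as pd
from openpyxl.utils.dataframe import dataframe_to_rows

df = pd.DataFrame({
    "Product Name": ["Product 1", "Product 2"],
    "Sales Month 1": [10, 20],
    "Sales Month 2": [5, 35],
})

# Without the index: one header row followed by one row per record
rows_without_index = list(dataframe_to_rows(df, index=False, header=True))

# With the index: every data row gains the index value as its first item
rows_with_index = list(dataframe_to_rows(df, index=True, header=True))

for row in rows_without_index:
    print(row)
```

Depending on your openpyxl version, the output with index=True may also contain an extra row holding the index names, so it's worth inspecting the rows before appending them to a sheet.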

On the other hand, if you want to convert a spreadsheet into a DataFrame, you can also do it in a very straightforward way, like so:

    import pandas as pd
    from openpyxl import load_workbook

    workbook = load_workbook(filename="sample.xlsx")
    sheet = workbook.active

    values = sheet.values
    df = pd.DataFrame(values)

Alternatively, if you want to add the correct headers and use the review ID as the index, for example, then you can also do it like this instead:

    import pandas as pd
    from openpyxl import load_workbook
    from mapping import REVIEW_ID

    workbook = load_workbook(filename="sample.xlsx")
    sheet = workbook.active

    data = sheet.values

    # Set the first row as the columns for the DataFrame
    cols = next(data)
    data = list(data)

    # Set the field "review_id" as the indexes for each row
    idx = [row[REVIEW_ID] for row in data]

    df = pd.DataFrame(data, index=idx, columns=cols)

Using indexes and columns allows you to access data from your DataFrame easily:


    >>> df.columns
    Index(['marketplace', 'customer_id', 'review_id', 'product_id',
           'product_parent', 'product_title', 'product_category', 'star_rating',
           'helpful_votes', 'total_votes', 'vine', 'verified_purchase',
           'review_headline', 'review_body', 'review_date'],
          dtype='object')

    >>> # Get first 10 reviews' star rating
    >>> df["star_rating"][:10]
    R3O9SGZBVQBV76    5
    RKH8BNC3L5DLF     5
    R2HLE8WKZSU3NL    2
    R31U3UH5AZ42LL    5
    R2SV659OUJ945Y    4
    RA51CP8TR5A2L     5
    RB2Q7DLDN6TH6     5
    R2RHFJV0UYBK3Y    1
    R2Z6JOQ94LFHEP    5
    RX27XIIWY5JPB     4
    Name: star_rating, dtype: int64

    >>> # Grab review with id "R2EQL1V1L6E0C9", using the index
    >>> df.loc["R2EQL1V1L6E0C9"]
    marketplace               US
    customer_id         15305006
    review_id     R2EQL1V1L6E0C9
    product_id        B004LURNO6
    product_parent     892860326
    review_headline   Five Stars
    review_body          Love it
    review_date       2015-08-31
    Name: R2EQL1V1L6E0C9, dtype: object
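If you don't have sample.xlsx or the mapping module at hand, you can try the same conversion on a small in-memory worksheet. The sketch below is one way to do it under a few assumptions: the rows are made up, and the position of the review_id column is looked up from the header row instead of being imported from a constant.

```python
import pandas as pd
from openpyxl import Workbook

# Build a tiny in-memory worksheet standing in for sample.xlsx (made-up rows)
workbook = Workbook()
sheet = workbook.active
sheet.append(["review_id", "star_rating", "review_headline"])
sheet.append(["R1", 5, "Five Stars"])
sheet.append(["R2", 2, "Not great"])

data = sheet.values

# The first row holds the column names
cols = list(next(data))
data = list(data)

# Look up the position of "review_id" in the header instead of hard-coding it
review_id_position = cols.index("review_id")
idx = [row[review_id_position] for row in data]

df = pd.DataFrame(data, index=idx, columns=cols)

# Access a single value through the index and the column name
rating = df.loc["R1", "star_rating"]
print(rating)
```

Looking the column up by name keeps the code working even if the column order in the spreadsheet changes.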

There you go! Whether you want to use openpyxl to prettify your Pandas dataset or use Pandas to do some hardcore algebra, you now know how to switch between both packages.

Conclusion

Phew, after that long read, you now know how to work with spreadsheets in Python! You can rely on openpyxl, your trustworthy companion, to:

  • Extract valuable information from spreadsheets in a Pythonic manner
  • Create your own spreadsheets, no matter the complexity level
  • Add cool features such as conditional formatting or charts to your spreadsheets

There are a few other things you can do with openpyxl that might not have been covered in this tutorial, but you can always check the package's official documentation website to learn more about it. You can even venture into checking its source code and improving the package further.

Feel free to leave any comments below if you have any questions, or if there's any section you'd love to hear more about.
