How to extract cell format (bold, italic, color, etc.) containing rich text using python?

185 Views Asked by At

I would like to extract font of a cell containing rich text ,which means in same cell there is a text in bold and another text in italic enter image description here, I worked with openpyxl and I tried with bellow code but I got an incorrect result: False

import openpyxl
path="doc.xlsx"
wb_obj = openpyxl.load_workbook(path)
sheet_obj = wb_obj.active
cell_obj = sheet_obj.cell(row = 5, column = 5)
print(cell_obj.font.bold)
1

There are 1 best solutions below

4
moken On BEST ANSWER

Further to the comment A textblock may be a whole word, two or more words or part of the word depending on where the font changes exist. Its also possible you might see the space character [between words] as a separate textblock if the space inherits the default font and not the style of the previous or next word.

For example, given the following text in the cell
enter image description here
we would expect 3 textblocks, one for each word.

To obtain the font details you might want to loop the text blocks using a for loop and print out the font style for each text block as shown;

In the example code Either 'b' or 'bold' can be used to access the textblock bold state;
e.g.
font_style.b
and
font_style.bold
return the status of bolding in the text block. The same goes for italic where either .i or .italic can be used and for underline, .u or .underline can be used.

import openpyxl


path="foo.xlsx"
wb_obj = openpyxl.load_workbook(path, rich_text=True)
sheet_obj = wb_obj.active

cell_obj = sheet_obj.cell(row=1, column=1)

for textblock_item in cell_obj.value:
    ### Check the textblock_item is an Openpyxl RichText TextBlock
    print(f"textblock_item type: '{type(textblock_item)}'")
    ### Type should be 'openpyxl.cell.rich_text.TextBlock'     

    ### Either of the following two lines should print the actual text in the block
    print(f"Block of text: '{textblock_item.text}'")
    print(f"Block of text: '{textblock_item}'")

    font_style = textblock_item.font
    print(f"Font Enhancements; Bold: {font_style.b}, "  
          f"Italic: {font_style.italic}, "
          f"Underline: {font_style.underline}")
    print(f"Font Size: {font_style.size}")
    print(f"Font family: {font_style.rFont}")
    if font_style.color is not None:
        if isinstance(font_style.color.index, str):
            print(f"Font Colour: {font_style.color.rgb}")
        else:
            print(f"Font Colour: Theme {font_style.color.theme}, Tint: {font_style.color.tint}")
    else:
        print(f"Font Colour: None")

    print("-----------------------------------\n")

and the output would be

Block of text: 'bold '
Font Enhancements; Bold: True, Italic: False, Underline: None
Font Size: 10.0
Font family: Calibri
Font Colour: FF00FF00
-----------------------------------

Block of text: 'itallic '
Font Enhancements; Bold: False, Italic: True, Underline: None
Font Size: 11.0
Font family: Arial
Font Colour: FFFF0000
-----------------------------------

Block of text: 'underline '
Font Enhancements; Bold: False, Italic: False, Underline: single
Font Size: 12.0
Font family: Verdana
Font Colour: FF0000FF
-----------------------------------

You can see the trailing spaces for each word is included in the word block but it may be separate as mentioned so you could include a check for that in the loop if its a problem.

As mentioned each block separates the font styles so if in the same example the first word 'bold' had the 'l' only underlined (no other font changes) this would change that textblock into three separate blocks made up of the characters in the word.
Example

enter image description here

Block of text: 'bo'
Font Enhancements; Bold: True, Italic: False, Underline: None
Font Size: 10.0
Font family: Calibri
Font Colour: FF00FF00
-----------------------------------

Block of text: 'l'
Font Enhancements; Bold: True, Italic: False, Underline: single
Font Size: 10.0
Font family: Calibri
Font Colour: FF00FF00
-----------------------------------

Block of text: 'd '
Font Enhancements; Bold: True, Italic: False, Underline: None
Font Size: 10.0
Font family: Calibri
Font Colour: FF00FF00
-----------------------------------

... with the rest of the words as shown previously.