I'm trying to develop a simple app in Python Flet that displays each page of a PDF file. The code imports the pypdf library for PDF handling. The UI consists of a button that loads the first page of the PDF and then skips to the next page on each click, and a Flet container whose content is a Flet image. The Flet image takes a Base64-encoded string, which should correspond, in turn, to each page of the PDF.
import flet as ft
import pypdf

def main(page: ft.Page):
    def btn_Click(e):
        cont.content = ft.Image(src_base64 = reader.pages[0],
                                fit=ft.ImageFit.FILL,
                                )
        page.update()
        if btn.data < len(reader.pages):
            btn.data +=1

    reader = pypdf.PdfReader('Your Pdf filename.pdf')
    cont = ft.Container(height = 0.8*page.height,
                        width = 0.4 * page.width,
                        border=ft.border.all(3, ft.colors.RED),)
    btn = ft.IconButton(
        icon=ft.icons.UPLOAD_FILE,
        on_click=btn_Click,
        icon_size=35,
        data=0,)
    page.add(ft.Column([cont, btn], horizontal_alignment="center"))
    page.horizontal_alignment = "center"
    page.scroll = ft.ScrollMode.AUTO
    page.update()

ft.app(target=main, assets_dir="assets")
Once the button is clicked, I get this error:
Error decoding base64: FormatException: Invalid character (at character 1)
{'/Contents': [IndirectObject(2286, 0, 1969514531216), IndirectObject(2287,...
^
Searching the web, I found that this exception has already been reported for Flutter, the framework on which Flet is based. See this link, this and this. It was suggested to apply this conversion:
base64.decode(sourceContent.replaceAll(RegExp(r'\s+'), ''))
I don't know how to apply it to my variable reader, or, alternatively, whether pypdf provides a method for this conversion.
asked Nov 21, 2024 at 15:07 by eljamba
1 Answer
pypdf does not implement any rendering functionality, so it cannot be used for your use case here. The output you get is the internal representation of the PDF page used by pypdf.
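To illustrate why the decoder rejects the value at character 1, here is a minimal sketch using only the standard library. The helper name `is_valid_base64` is hypothetical, and the sample string is taken from the error message in the question with its truncated tail closed off for illustration. It shows that the text form of a page object is not Base64 at all, so stripping whitespace as in the Dart snippet would not help.

```python
import base64
import binascii


def is_valid_base64(text: str) -> bool:
    """Return True if `text` decodes as strict Base64 (hypothetical helper)."""
    try:
        # validate=True rejects any character outside the Base64 alphabet.
        base64.b64decode(text, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


# Text resembling the start of the value shown in the error message.
page_repr = "{'/Contents': [IndirectObject(2286, 0, 1969514531216)]}"
# What an image widget would need instead: Base64 text made from raw bytes.
encoded_bytes = base64.b64encode(b"raw image bytes").decode("ascii")

print(is_valid_base64(page_repr))      # False
print(is_valid_base64(encoded_bytes))  # True
```

The first check fails on the leading `{`, which matches the position reported in the FormatException.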
If you just want to display the PDF file inside the web browser, I would probably go with the pdf.js library which does all the rendering on the frontend side.
If you really need image-based output, there are lots of libraries/tools which you could use:
- pdf2image, which is based upon the poppler library
- (Py)MuPDF
- Ghostscript
- pypdfium2
- pyvips
- ...
An example for pdf2image to get the first page as JPG data could look like this:
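import base64
from io import BytesIO

import pdf2image

# Render only the first page; convert_from_path returns a list of PIL images.
images = pdf2image.convert_from_path('file.pdf', first_page=1, last_page=1, fmt='jpg', use_pdftocairo=True)
image = images[0]

# Serialize the image to JPEG bytes in memory, then Base64-encode them.
image_buffer = BytesIO()
image.save(image_buffer, format='JPEG')
# b64encode returns bytes; decode to str for APIs that expect a Base64 text string.
image_data = base64.b64encode(image_buffer.getvalue()).decode('ascii')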
For further options, like the resolution to use for rendering etc., see the corresponding upstream documentation.
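Extending the example to every page, as the question asks, could look roughly like the sketch below. The function names `to_base64_str` and `render_pdf_pages` and the default `dpi=100` are assumptions chosen for illustration, not part of pdf2image or Flet. It assumes pdf2image and poppler are installed.

```python
import base64
from io import BytesIO


def to_base64_str(raw: bytes) -> str:
    """Encode raw image bytes as a Base64 text string (hypothetical helper)."""
    return base64.b64encode(raw).decode("ascii")


def render_pdf_pages(path: str, dpi: int = 100) -> list[str]:
    """Render every page of the PDF at `path` to a Base64-encoded JPEG string."""
    # Imported here so that to_base64_str stays usable without pdf2image.
    import pdf2image

    pages = []
    for image in pdf2image.convert_from_path(path, dpi=dpi):
        buffer = BytesIO()
        # JPEG has no alpha channel, so normalize the mode before saving.
        image.convert("RGB").save(buffer, format="JPEG")
        pages.append(to_base64_str(buffer.getvalue()))
    return pages
```

In the click handler from the question, the resulting list could then be indexed with the button's counter, for example `ft.Image(src_base64=pages[btn.data])`, incrementing `btn.data` only while it is below `len(pages) - 1`. Rendering all pages up front is the simplest option; for large files, rendering one page per click with `first_page` and `last_page` may be preferable.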
[...] pdf2image based upon pdftocairo, PyMuPDF, Ghostscript or similar). – epR8GaYuh, commented Nov 22, 2024 at 20:07