insertText/insertTextbox create flipped text #425

Closed
retsyo opened this issue Dec 31, 2019 · 4 comments

@retsyo

retsyo commented Dec 31, 2019

1. I ran into one or two bugs

I am using PyMuPDF-1.16.10-cp36-none-win_amd64.whl, built on Dec 22, 2019, with Python 3.6.

I am trying to add a page number to each PDF page. The preferred text position is the center of the bottom edge. The initial code seems very clean.

For some of my PDF files the code works, but for others it does not. One example is blank.pdf, which was exported from a blank file in Typora 0.9.81 (beta) on 64-bit Windows 7 and is available on my GitHub.

bug 1

The text, which should be "page 1", is flipped whether I use insertText or insertTextbox, as you can see in flipped_text.png.

bug 2

What is more, the text is not located near the center.

So how can I fix this?

import fitz

fnPdfIn, fnPdfOut = 'blank.pdf', 'out.pdf'
fontsize = 11  # not defined in the original snippet; value assumed

doc2 = fitz.open()
pdfIn = fitz.open(fnPdfIn)
doc2.insertPDF(pdfIn)

for idxPage in range(len(doc2)):
    page = doc2[idxPage]
    _, _, wid, hi = page.MediaBox

    text =  'page %i' % (idxPage+1)

    # method 1: insertText
    #~ where = fitz.Point(wid*0.5, hi-10)
    #~ page.insertText(where, text, fontsize=fontsize, )

    # method 2: insertTextbox
    rect = fitz.Rect(
        wid/2-fontsize*len(text)/2, hi/2-fontsize/2,
        wid/2+fontsize*len(text)/2, hi/2+fontsize/2
    )
    page.drawRect(rect, color = (0.5, 0.5, 0.5), overlay=True)
    page.insertTextbox(rect, text, fontsize=fontsize)

doc2.save(fnPdfOut)
doc2.close()

2. This is not a bug

How can I place the center of the text exactly at the center of the bottom edge? In my case, the text contains some Unicode characters.

3. This is not a bug

Times New Roman should be used for English letters and numbers, and a Unicode font (for example, "C:\Windows\Fonts\simsun.ttc") for the other Unicode characters.
What is an easy way to assign fonts to the text like this?

thanks

@JorjMcKie
Collaborator

Re: 1 This is not a bug either!

Most probably the geometry of your page has been changed. See the documentation https://pymupdf.readthedocs.io/en/latest/faq/#misplaced-item-insertions-on-pdf-pages for how to best detect and handle this.

Re 2 How to center text

First, you need to define a rectangle within which the page number should occur. For example, define the rectangle height as 30 and let rect be the page rectangle: bottom_rect = fitz.Rect(0, rect.height - 30, rect.width, rect.height).
Then do page.insertTextbox(bottom_rect, "Page %i" % (page.number + 1), align=fitz.TEXT_ALIGN_CENTER). Note that page.number is 0-based, hence the + 1.
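The centering recipe above could be sketched as follows. This is a minimal illustration, not a tested recipe for every PDF: the helper names bottom_band and add_page_numbers are hypothetical, the strip height of 30 is taken from the example above, and the snake_case method insert_textbox is the newer spelling of the insertTextbox call used in this thread.

```python
def bottom_band(page_width, page_height, band_height=30):
    """Return (x0, y0, x1, y1) of a full-width strip along the page bottom."""
    return (0, page_height - band_height, page_width, page_height)


def add_page_numbers(doc, fontsize=11, band_height=30):
    """Write a horizontally centered 'Page N' into the bottom strip of each page."""
    import fitz  # PyMuPDF; imported here so bottom_band() works without it

    for page in doc:
        rect = fitz.Rect(*bottom_band(page.rect.width, page.rect.height, band_height))
        page.insert_textbox(
            rect,
            "Page %i" % (page.number + 1),  # page.number is 0-based
            fontsize=fontsize,
            align=fitz.TEXT_ALIGN_CENTER,
        )
```

Because the strip spans the full page width, center alignment inside it puts the text at the horizontal center of the page. The strip must be tall enough for one line of text at the chosen font size, otherwise nothing is written.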

Re 3 Inserting text with mixtures of ASCII and non-ASCII characters

That is not so easy. Best would be a font which supports both. See this example, which fills a text box with such a mixed text:

import fitz

doc = fitz.open()  # new PDF
ffile = "C:/windows/fonts/msyhl.ttc"  # chinese font, light version

fsize = 10  # fontsize
lheight = fsize * 1.2  # line height
text = u"你好!hello, Hallo! 我很喜欢德国!德国是个好地方!"  # text from Pengfei
text += "More text is following in English and justified ... let's see what happens."
page = doc.newPage()
page.insertFont(fontname="f0", fontfile=ffile)
rect = fitz.Rect(50, 100, 170, 250)
pivot = rect.bl
mat = fitz.Matrix(-30)
morph = (pivot, mat)
rc = page.insertTextbox(
    rect, text, fontname="f0", rotate=90, align=fitz.TEXT_ALIGN_JUSTIFY, morph=morph
)
print("space surplus(+) or deficit(-): %g" % rc)
page.drawRect(rect, width=0.3, morph=morph)
doc.save(__file__ + ".pdf", garbage=3, deflate=True)

The result looks like this: [image]

@retsyo
Author

retsyo commented Jan 1, 2020

thanks

As for mixtures of ASCII and non-ASCII characters: maybe there is no easy way. By contrast, in Microsoft Word we can select the text, assign the simsun Chinese font, and then assign Times New Roman.

Yes, a Chinese font commonly supports Chinese, English letters, numbers and more. But in my case the rules demand Times New Roman for English letters and digits, perhaps because English letters and digits look ugly when the simsun Chinese font is used.

@JorjMcKie
Collaborator

JorjMcKie commented Jan 1, 2020

I understand. Then there is no "easy" way I am afraid and you probably have to do this (per page):

  • insert the required fonts (times new roman TNR, simsun)
  • after that, create a list of character widths for each font using doc.getCharWidths(xref, limit=nnn), where xref is the xref returned by font insertion, and limit is an integer you provide to limit the list length. For the ASCII font, let it default to 256; for the Chinese font, choose a large number over 8000.
  • determine the rectangle within which the given text string must appear
  • loop over the characters of the text string
  • if ord(char) is within the ASCII range (32 <= ord(char) <= 127) insert the char using font TNR, otherwise font simsun.
  • for the given rectangle, the first character must be positioned at fitz.Point(rect.x0, rect.y0 + fontsize*1.2), which is one line height below the rectangle top
  • every next character must be positioned at where the previous one ended
  • this insertion point can be calculated by using the width of the previous character, add (char-width, 0) to the previous character's insertion point.
  • if the new character would exceed the rectangle width, start a new line by choosing insertion point fitz.Point(rect.x0, y) + (0, fontsize*1.2) where y is the height of the current line.
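The positioning steps in this list could be sketched roughly as below. The sketch only computes where each character goes and which font it gets; it does not call PyMuPDF. The function name layout_mixed_text, the font names "TNR" and "simsun", and the width_of callback are all hypothetical. width_of is assumed to return a glyph width for font size 1, for example looked up in the width lists described above.

```python
def layout_mixed_text(text, rect, fontsize, width_of,
                      ascii_font="TNR", other_font="simsun"):
    """Return a list of (char, fontname, x, y) insertion points.

    rect is (x0, y0, x1, y1); width_of(char, fontname) is a caller-supplied
    (hypothetical) lookup giving the glyph width at font size 1.
    """
    x0, y0, x1, _ = rect
    line_height = fontsize * 1.2
    x, y = x0, y0 + line_height  # first baseline: one line height below the top
    placements = []
    for ch in text:
        # ASCII range goes to the Latin font, everything else to the CJK font
        font = ascii_font if 32 <= ord(ch) <= 127 else other_font
        width = width_of(ch, font) * fontsize
        if x + width > x1 and x > x0:
            # character would exceed the rectangle width: start a new line
            x = x0
            y += line_height
        placements.append((ch, font, x, y))
        x += width  # next character starts where this one ended
    return placements
```

Each tuple could then be passed to a single-character text insertion, e.g. page.insertText((x, y), ch, fontname=font, fontsize=fontsize) in the API version used in this thread. The sketch does not check whether the text runs past the bottom of the rectangle.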

Good luck!

@Allen-LCG

Allen-LCG commented May 13, 2022

For the first issue, here is how I resolved it: if you are editing an existing PDF document, the page should first be wrapped with page.wrapContents(). I hope this helps others; I struggled with this for a whole day.

page = doc[page_no]
page.wrapContents()
font_name = 'msyhl'
page.insert_font(fontname=font_name, fontfile=fontMsyhPath, fontbuffer=None, set_simple=False)
page.insert_text(point=(100, 100), text='你好, hello, halo', fontsize=50, fontname=font_name, rotate=0, render_mode=0)
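To illustrate why wrapping can help, here is a conceptual sketch in plain Python. It assumes the existing page content changes the coordinate system (for example with a y-axis flip) and never restores it, so anything appended afterwards inherits that change. Enclosing the old content in the PDF operators q (save graphics state) and Q (restore graphics state) undoes this for appended content. The function names and the sample stream are made up for illustration, and this is not PyMuPDF's actual implementation.

```python
def wrap_content_stream(stream: bytes) -> bytes:
    """Enclose a raw PDF content stream in q ... Q (save / restore graphics state)."""
    return b"q\n" + stream + b"\nQ\n"


def looks_wrapped(stream: bytes) -> bool:
    """Naive check: does the stream start with q and end with Q?

    This is only a heuristic; it does not verify that every q inside the
    stream has a matching Q.
    """
    tokens = stream.split()
    return not tokens or (tokens[0] == b"q" and tokens[-1] == b"Q")


# Hypothetical page content that flips the y-axis and leaves it flipped
original = b"1 0 0 -1 0 842 cm\nBT /F1 12 Tf (old text) Tj ET"
wrapped = wrap_content_stream(original)
```

With the wrapped version, text added after the final Q is drawn in the default coordinate system again instead of the flipped one.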
