find ranges of page numbers #23
Note that this adds to requirements.py, the |
Ports a feature from PPtools: find page numbers in various formats and display them (roman first, arabic next). Understands different formats like Page_1, page_1, page1 (in the id attribute) and attempts to parse span class=pagenum formats like p. 1, [Pg 1] and so forth.
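A minimal sketch of the kind of matching described here, not the code from this PR. The patterns and the helper names `label_from_id` and `label_from_pagenum_text` are hypothetical, and the sketch uses the standard-library `re` module rather than `regex` so that it stays self-contained.

```python
import re

# Hypothetical patterns; the expressions used in pphtml.py may differ.
# Matches id values such as "Page_1", "page_1", "page1" or "Page_xiv".
ID_PATTERN = re.compile(r"page_?([0-9]+|[ivxlcdm]+)", re.IGNORECASE)

# Matches pagenum span text such as "p. 1", "[Pg 1]" or "[Pg xii]".
PAGENUM_TEXT_PATTERN = re.compile(
    r"\[?\s*(?:p\.|pg\.?|page)?\s*([0-9]+|[ivxlcdm]+)\s*\]?",
    re.IGNORECASE,
)


def label_from_id(id_value):
    """Return the page label found in an id attribute, or None."""
    match = ID_PATTERN.fullmatch(id_value.strip())
    return match.group(1).lower() if match else None


def label_from_pagenum_text(text):
    """Return the page label found in pagenum span text, or None."""
    match = PAGENUM_TEXT_PATTERN.fullmatch(text.strip())
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    for sample in ("Page_1", "page_1", "page1", "Page_xiv"):
        print(sample, "->", label_from_id(sample))
    for sample in ("p. 1", "[Pg 1]", "[Pg xii]"):
        print(sample, "->", label_from_pagenum_text(sample))
```

Labels are returned as lowercase strings so that roman and arabic numbers can be separated in a later step.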
b6c26ee to 4d9cbde
pphtml.py
Outdated
import roman
from time import strftime
from html.parser import HTMLParser
import regex as re # for unicode support (pip install regex)
Can we follow the convention of grouping the built-in packages together, a newline, and then the 3rd party?
import sys
import os
import argparse
import itertools
from time import strftime
from html.parser import HTMLParser

import regex as re # for unicode support
import roman
from PIL import Image

Ideally each would be alpha-sorted but I'm not going to get wound around the axle about it.
done! and sorted :)
Ports a feature from PPtools: find page numbers in various formats and display them (roman first, arabic next).

Understands different formats like `Page_1`, `page_1`, `page1` (in the `id` attribute). Also attempts to parse numbers from `<span class="pagenum">` tags (`p. 1`, `[Pg 1]`, etc.).

Example of what this looked like in PPTools:

Example from this change:
Can display multiple ranges if numbers are missing from the sequence:
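
As a rough illustration of the behaviour described in this PR (roman numbers first, arabic next, with a new range started wherever a number is missing), here is a minimal sketch. It is not the code from this change: `roman_to_int`, `collapse_ranges` and `summarize` are hypothetical names, and the roman conversion is hand-rolled so the example does not depend on the `roman` package.

```python
# Values of the individual roman numeral letters.
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(numeral):
    """Convert a roman numeral such as 'xiv' to an integer (no validation)."""
    total = 0
    previous = 0
    # Walk right to left: a smaller value before a larger one is subtracted.
    for char in reversed(numeral.lower()):
        value = ROMAN_VALUES[char]
        if value < previous:
            total -= value
        else:
            total += value
        previous = value
    return total


def collapse_ranges(numbers):
    """Collapse integers into sorted (start, end) runs of consecutive values."""
    ranges = []
    for number in sorted(set(numbers)):
        if ranges and number == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], number)  # extend the current run
        else:
            ranges.append((number, number))  # a gap starts a new run
    return ranges


def summarize(labels):
    """Return roman ranges first, then arabic ranges, for lowercase labels."""
    roman_pages = [roman_to_int(label) for label in labels if not label.isdigit()]
    arabic_pages = [int(label) for label in labels if label.isdigit()]
    return [
        ("roman", collapse_ranges(roman_pages)),
        ("arabic", collapse_ranges(arabic_pages)),
    ]


if __name__ == "__main__":
    # Page 4 is missing, so the arabic pages split into two ranges.
    print(summarize(["i", "ii", "iii", "1", "2", "3", "5", "6"]))
```

Ranges are kept as integer pairs here; converting roman ranges back to numerals for display is left out of the sketch.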