java-file-handling: Marathi voters list pdf parsing

I have a voters list PDF in the Marathi language. I am parsing it with the iText Java library, but the extracted characters are wrong. Please help me with this.

Posted On : 2018-06-19 09:10:13.0
rahul patil
Answers


You need to use iText 7. The following snippet extracts the text of one page and writes it to a file:


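// iText 7 imports
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.canvas.parser.PdfTextExtractor;
// JDK and JUnit imports used below
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import org.junit.Test;

public class HindiMarathiText {

    @Test
    public void go() throws Exception {
        // Open the source PDF; try-with-resources closes it automatically
        try (PdfDocument doc = new PdfDocument(new PdfReader("input.pdf"))) {
            try (OutputStream os = new FileOutputStream("output.txt")) {
                // Extract the text of page 3 only (page numbers start at 1)
                String result = PdfTextExtractor.getTextFromPage(doc.getPage(3));
                // Write the text as UTF-16 so the Devanagari characters are preserved
                os.write(result.getBytes(StandardCharsets.UTF_16));
            }
        }
    }
}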
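
Since the original problem was wrong characters in the output, a quick way to judge an extraction result is to measure how much of it is actually Devanagari (the script used for Marathi and Hindi). The sketch below is only an illustration and is not part of iText: the class name DevanagariCheck and the method devanagariRatio are hypothetical, and it uses nothing beyond the JDK. It assumes the extracted text is available as a Java String, for example the result variable from the snippet above.

```java
// Hypothetical helper, not part of iText: a rough sanity check for extracted text.
final class DevanagariCheck {

    // Returns the fraction of non-whitespace code points that belong to the
    // Devanagari Unicode block. Returns 0.0 for empty or whitespace-only text.
    static double devanagariRatio(String text) {
        long total = text.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .count();
        if (total == 0) {
            return 0.0;
        }
        long devanagari = text.codePoints()
                .filter(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.DEVANAGARI)
                .count();
        return (double) devanagari / total;
    }
}
```

A ratio close to 1.0 suggests the page came out as Marathi text, while a ratio near 0.0 on a page that should be Marathi suggests the extraction is still producing wrong characters. A voters list also contains digits and punctuation, so any threshold you pick is a judgment call.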


Note:
-----------------------------
You need to build iText 7 from source (https://github.com/itext/itext7) to get output of this quality. The functionality is available in the iText 7.0.2 release; see the iText 7.0.2 release notes.


Some useful references:
--------------------------------------------

iText 7 pdfCalligraph

iText language-specific examples

Posted On : 2018-06-19 22:21:07
Rishi Kumar