How to build an Android app with a boilerpipe dependency


I want to use boilerpipe (https://github.com/kohlschutter/boilerpipe) in my Android app, but I'm unable to build the app with this dependency.

My build.gradle contains:

implementation group: 'com.syncthemall', name: 'boilerpipe', version: '1.2.2'

The code which uses boilerpipe:

import de.l3s.boilerpipe.document.TextDocument;
import de.l3s.boilerpipe.extractors.CommonExtractors;
import de.l3s.boilerpipe.sax.BoilerpipeSAXInput;
import de.l3s.boilerpipe.sax.HTMLDocument;
import de.l3s.boilerpipe.sax.HTMLFetcher;
import org.xml.sax.SAXException;

...
final HTMLDocument htmlDoc = HTMLFetcher.fetch(new URL("https://dzone.com/articles/database-connection-pooling-in-java-with-hikaricp"));
final TextDocument doc = new BoilerpipeSAXInput(htmlDoc.toInputSource()).getTextDocument();
String content = CommonExtractors.ARTICLE_EXTRACTOR.getText(doc);
...

When I run the code above from a JUnit test, it works. But when I build the app to run it on a device, the build fails with the following errors:

Duplicate class org.cyberneko.html.HTMLElements found in modules jetified-boilerpipe-1.2.2 (com.syncthemall:boilerpipe:1.2.2) and jetified-nekohtml-1.9.20 (net.sourceforge.nekohtml:nekohtml:1.9.20)
Duplicate class org.cyberneko.html.HTMLElements$Element found in modules jetified-boilerpipe-1.2.2 (com.syncthemall:boilerpipe:1.2.2) and jetified-nekohtml-1.9.20 (net.sourceforge.nekohtml:nekohtml:1.9.20)
Duplicate class org.cyberneko.html.HTMLElements$ElementList found in modules jetified-boilerpipe-1.2.2 (com.syncthemall:boilerpipe:1.2.2) and jetified-nekohtml-1.9.20 (net.sourceforge.nekohtml:nekohtml:1.9.20)
Duplicate class org.cyberneko.html.HTMLTagBalancer found in modules jetified-boilerpipe-1.2.2 (com.syncthemall:boilerpipe:1.2.2) and jetified-nekohtml-1.9.20 (net.sourceforge.nekohtml:nekohtml:1.9.20)
Duplicate class org.cyberneko.html.HTMLTagBalancer$ElementEntry found in modules jetified-boilerpipe-1.2.2 (com.syncthemall:boilerpipe:1.2.2) and jetified-nekohtml-1.9.20 (net.sourceforge.nekohtml:nekohtml:1.9.20)
Duplicate class org.cyberneko.html.HTMLTagBalancer$Info found in modules jetified-boilerpipe-1.2.2 (com.syncthemall:boilerpipe:1.2.2) and jetified-nekohtml-1.9.20 (net.sourceforge.nekohtml:nekohtml:1.9.20)
Duplicate class org.cyberneko.html.HTMLTagBalancer$InfoStack found in modules jetified-boilerpipe-1.2.2 (com.syncthemall:boilerpipe:1.2.2) and jetified-nekohtml-1.9.20 (net.sourceforge.nekohtml:nekohtml:1.9.20)

I analyzed the app's dependencies and found that nekohtml is pulled in only by boilerpipe:

+--- com.syncthemall:boilerpipe:1.2.2
|    +--- net.sourceforge.nekohtml:nekohtml:1.9.20
|    |    \--- xerces:xercesImpl:2.10.0 -> 2.11.0
|    |         \--- xml-apis:xml-apis:1.4.01
|    \--- xerces:xercesImpl:2.11.0 (*)

So why are there class collisions?

After that, I created a plain Java, Maven-based project with the same dependency and used boilerpipe there:

<dependency>
   <groupId>com.syncthemall</groupId>
   <artifactId>boilerpipe</artifactId>
   <version>1.2.2</version>
</dependency>

No class collisions were reported. What does Android do differently? Does it have something to do with the jetified- prefix?
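The error messages name the boilerpipe jar itself as one of the two sources, which suggests that boilerpipe-1.2.2.jar ships its own copies of some org.cyberneko.html classes in addition to depending on nekohtml. A plain JVM classpath tolerates that (the first matching class is loaded), whereas the Android build checks for duplicate classes before packaging, which would explain why the Maven project does not complain. To check this assumption, here is a minimal sketch that lists the class entries present in more than one jar. The class name DuplicateClassFinder and the jar paths passed on the command line are hypothetical; it uses only java.util.jar from the standard library.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of boilerpipe or the Android tooling.
public class DuplicateClassFinder {

    /**
     * Maps each .class entry name to the file names of the jars containing it,
     * keeping only the entries that occur in two or more of the given jars.
     */
    static Map<String, List<String>> findDuplicates(List<Path> jars) throws IOException {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Path jar : jars) {
            try (JarFile jarFile = new JarFile(jar.toFile())) {
                Enumeration<JarEntry> entries = jarFile.entries();
                while (entries.hasMoreElements()) {
                    String name = entries.nextElement().getName();
                    if (name.endsWith(".class")) {
                        owners.computeIfAbsent(name, k -> new ArrayList<>())
                              .add(jar.getFileName().toString());
                    }
                }
            }
        }
        // Drop every class that lives in a single jar only.
        owners.values().removeIf(list -> list.size() < 2);
        return owners;
    }

    public static void main(String[] args) throws IOException {
        List<Path> jars = new ArrayList<>();
        for (String arg : args) {
            jars.add(Paths.get(arg));
        }
        findDuplicates(jars).forEach((cls, in) -> System.out.println(cls + " -> " + in));
    }
}
```

With Java 11 or later this can be run directly against the two jars from the Gradle or Maven cache, for example: java DuplicateClassFinder.java boilerpipe-1.2.2.jar nekohtml-1.9.20.jar. If the assumption holds, the printed entries should match the classes listed in the build error.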
