ksoup

Kotlin DSL Jsoup implementation with Retrofit2 Converter

License: Apache License, Version 2.0
GroupId: com.github.timtimmahh
ArtifactId: ksoup
Last Version: 0.2.1
Type: aar
Description: Kotlin DSL Jsoup implementation with Retrofit2 Converter
Project URL: https://github.com/timtimmahh/ksoup
Source Code Management: https://github.com/timtimmahh/ksoup

How to add to project

Maven:

<dependency>
    <groupId>com.github.timtimmahh</groupId>
    <artifactId>ksoup</artifactId>
    <version>0.2.1</version>
    <type>aar</type>
</dependency>

Gradle (Groovy DSL):

implementation 'com.github.timtimmahh:ksoup:0.2.1'

Gradle (Kotlin DSL):

implementation("com.github.timtimmahh:ksoup:0.2.1")

Buildr:

'com.github.timtimmahh:ksoup:aar:0.2.1'

Ivy:

<dependency org="com.github.timtimmahh" name="ksoup" rev="0.2.1">
  <artifact name="ksoup" type="aar" />
</dependency>

Grape:

@Grapes(
    @Grab(group='com.github.timtimmahh', module='ksoup', version='0.2.1')
)

SBT:

libraryDependencies += "com.github.timtimmahh" % "ksoup" % "0.2.1"

Leiningen:

[com.github.timtimmahh/ksoup "0.2.1"]

Dependencies

Compile dependencies (5)

Group / Artifact                                          Type  Version
org.jetbrains.kotlin : kotlin-android-extensions-runtime  jar   1.3.50
org.jetbrains.kotlin : kotlin-stdlib-jdk8                 jar   1.3.50
org.jetbrains.kotlin : kotlin-reflect                     jar   1.3.50
org.jsoup : jsoup                                         jar   1.12.1
com.squareup.retrofit2 : retrofit                         jar   2.6.1

KSoup

A Kotlin DSL for JSoup HTML parsing.

KSoup makes HTML parsing easier by wrapping the JSoup parser in a Kotlin DSL, without giving up any functionality of the original JSoup library.

KSoup provides helper functions that simplify specifying CSS selectors and extracting the desired data. Currently, it provides functions for obtaining:

  • String
  • Float
  • Double
  • Int
  • Long
  • Collections of any type
  • Maps with any key and value
  • Nested JSoup Elements

Installation

The latest versions of KSoup are hosted on the JitPack Maven repository, so you must add it to the build.gradle in your project root:

allprojects {
  repositories {
    ...
    maven { url 'https://jitpack.io' }
  }
}

Then, you can add KSoup as a dependency to your module:

dependencies {
  implementation 'com.github.timtimmahh:ksoup:master-SNAPSHOT'
}
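
For projects that use Gradle's Kotlin DSL, the same setup might look like the sketch below. It assumes a root build.gradle.kts and a module build.gradle.kts, and it pins the 0.2.1 release instead of the snapshot; the coordinates are the ones listed above.

```kotlin
// Root build.gradle.kts: make the JitPack repository available to all modules.
allprojects {
    repositories {
        maven { url = uri("https://jitpack.io") }
    }
}

// Module build.gradle.kts: depend on a fixed KSoup release.
dependencies {
    implementation("com.github.timtimmahh:ksoup:0.2.1")
}
```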

Use the version master-SNAPSHOT to target the latest commit. You can also use commit tags or release versions; the current release is 0.2.1.

For additional help, follow the directions as specified on JitPack's website.

Basic Usage

To use KSoup, first create the model you want to parse the HTML into. Declare each property with the var keyword so that its value can be updated during parsing. Properties don't need default values, but having them makes it easier to specify the instance generator for the model, because you can pass a constructor reference.

data class DetailedClass(
    var title: String = "",
    var status: String = "",
    var reason: String = "",
    var units: Float = 0f,
    var grading: String = "",
    var grade: String = "",
    var classNumber: Int = 0,
    var section: Int = 0,
    var component: String = "",
    var datesAndTimes: MutableList<String> = mutableListOf(),
    var room: MutableList<String> = mutableListOf(),
    var instructor: MutableList<String> = mutableListOf(),
    var startEndDate: MutableList<String> = mutableListOf(),
    var aidEligible: String = ""
) { constructor() : this("") }
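
As a rough illustration of why a no-argument constructor helps, the sketch below passes an instance generator of type () -> T in two ways. The models and the create function are hypothetical stand-ins written for this example; they are not part of KSoup.

```kotlin
// Hypothetical model with a no-argument secondary constructor.
data class WithNoArg(var title: String, var units: Float) {
    constructor() : this("", 0f)
}

// Hypothetical model that can only be built with explicit arguments.
data class NoDefaults(var title: String, var units: Float)

// Stand-in for any function that accepts an instance generator.
fun <T : Any> create(instanceGenerator: () -> T): T = instanceGenerator()

fun main() {
    // A constructor reference is enough when a no-argument constructor exists.
    val a = create(::WithNoArg)
    // Otherwise the generator has to be spelled out as a lambda.
    val b = create { NoDefaults("", 0f) }

    a.title = "Calculus" // var properties can be updated after creation
    println(a)
    println(b)
}
```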

With the model set up, specify the CSS selectors and parse rules for each property.

The following example uses the abstract class ParseBuilder<V : Any>, whose abstract property build holds the DSL definition. The function fun <T : Any> buildParser(instanceGenerator: () -> T, builder: SimpleParser<T>.() -> Unit): Lazy<SimpleParser<T>> returns a lazy delegate, so the DSL is only built on first use and no resources are wasted when it is never needed.

object DetailedClassBuilder : ParseBuilder<DetailedClass>() {
    override val build: SimpleParser<DetailedClass> by buildParser(::DetailedClass) {
        text("td.PAGROUPDIVIDER", DetailedClass::title)
        text(
            "table[id^=SSR_DUMMY_RECVW\$scroll] tr[id^=trSSR_DUMMY_RECVW] span[id=STATUS$0]",
            DetailedClass::status
        )
        text(
            "table[id^=SSR_DUMMY_RECVW\$scroll] tr[id^=trSSR_DUMMY_RECVW] span[id=ENRLSTATUSREASON$0]",
            DetailedClass::reason
        )
        float(
            "table[id^=SSR_DUMMY_RECVW\$scroll] tr[id^=trSSR_DUMMY_RECVW] span[id=DERIVED_REGFRM1_UNT_TAKEN$0]",
            DetailedClass::units
        )
        text(
            "table[id^=SSR_DUMMY_RECVW\$scroll] tr[id^=trSSR_DUMMY_RECVW] span[id=GB_DESCR$0]",
            DetailedClass::grading
        )
        text(
            "table[id^=SSR_DUMMY_RECVW\$scroll] tr[id^=trSSR_DUMMY_RECVW] span[id=DERIVED_REGFRM1_CRSE_GRADE_OFF$0]",
            DetailedClass::grade
        )
        int(
            "table[id^=CLASS_MTG_VW\$scroll] tr[id=trCLASS_MTG_VW$0_row1] div[id=win0divDERIVED_CLS_DTL_CLASS_NBR$0] > span",
            DetailedClass::classNumber
        )
        int(
            "table[id^=CLASS_MTG_VW\$scroll] tr[id=trCLASS_MTG_VW$0_row1] div[id=win0divDERIVED_CLS_DTL_CLASS_NBR$0] > span",
            DetailedClass::section
        )
        text(
            "table[id^=CLASS_MTG_VW\$scroll] tr[id=trCLASS_MTG_VW$0_row1] div[id=win0divMTG_COMP$0] > span",
            DetailedClass::component
        )
        collection(
            "table[id^=CLASS_MTG_VW\$scroll] tr[id^=trCLASS_MTG_VW] div[id^=win0divMTG_SCHED] > span",
            DetailedClass::datesAndTimes,
            Element::text
        )
        collection(
            "table[id^=CLASS_MTG_VW\$scroll] tr[id^=trCLASS_MTG_VW] div[id^=win0divMTG_LOC] > span",
            DetailedClass::room,
            Element::text
        )
        collection(
            "table[id^=CLASS_MTG_VW\$scroll] tr[id^=trCLASS_MTG_VW] div[id^=win0divDERIVED_CLS_DTL_SSR_INSTR_LONG] > span",
            DetailedClass::instructor,
            Element::text
        )
        collection(
            "table[id^=CLASS_MTG_VW\$scroll] tr[id^=trCLASS_MTG_VW] div[id^=win0divMTG_DATES] > span",
            DetailedClass::startEndDate,
            Element::text
        )
        text(
            "table[id^=CLASS_MTG_VW\$scroll] tr[id=trCLASS_MTG_VW$0_row1] div[id=win0divMTG_AID$0] > span",
            DetailedClass::aidEligible
        )
    }
}

In addition, you can use the function fun element(css: String, convert: (Element, V) -> Unit): Unit to specify parse rules that can't be obtained with the other helper functions.

Similarly, fun elements(css: String, convert: (Element, V) -> Unit): Unit works the same way, except that instead of converting only the first element matched by the CSS selector, it iterates over the returned Elements object and converts each Element into the desired result.
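
The difference between the two can be sketched with plain JSoup calls. This is not KSoup's implementation: the Course model, the selectors, and parseCourse are hypothetical, and the sketch only assumes that element corresponds to the first match and elements to every match, as described above.

```kotlin
import org.jsoup.Jsoup
import org.jsoup.nodes.Element

// Hypothetical model, used only for this illustration.
data class Course(
    var title: String = "",
    var rooms: MutableList<String> = mutableListOf()
)

fun parseCourse(html: String): Course {
    val doc = Jsoup.parse(html)
    val course = Course()

    // element-style rule: only the first match is converted.
    val setTitle: (Element, Course) -> Unit = { el, model -> model.title = el.text() }
    doc.selectFirst("h1.title")?.let { setTitle(it, course) }

    // elements-style rule: every match is converted in turn.
    val addRoom: (Element, Course) -> Unit = { el, model -> model.rooms.add(el.text()) }
    doc.select("li.room").forEach { addRoom(it, course) }

    return course
}

fun main() {
    val html = """
        <h1 class="title">Algorithms</h1>
        <ul><li class="room">Room A</li><li class="room">Room B</li></ul>
    """.trimIndent()
    println(parseCourse(html))
}
```

Both callbacks have the same (Element, V) -> Unit shape as the convert parameter, so the model is mutated in place rather than returned.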

Using Retrofit

If you'd like to use Retrofit to perform the requests, this library provides a Retrofit Converter.Factory called KSoupConverterFactory that finds the correct ParseBuilder for the return type of the request function. For KSoupConverterFactory to find it, you must specify the ParseBuilder with the @ResponseParser annotation. For example:

@ResponseParser(parser = CurrentUserAdapter::class)
@FormUrlEncoded
@POST("https://canvas.jmu.edu/saml_consume")
fun getCanvasProfileInfo(@Field("SAMLResponse") samlResponse: String): Deferred<Response<CurrentUser>>
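
Registering the converter could look like the sketch below. It uses only the standard Retrofit builder API and takes the converter factory as a parameter, because this README does not show how a KSoupConverterFactory instance is created. The buildRetrofit name and the base URL are assumptions made for the example.

```kotlin
import retrofit2.Converter
import retrofit2.Retrofit

// Builds a Retrofit instance around whatever converter factory is passed in,
// for example an instance of KSoupConverterFactory.
fun buildRetrofit(converterFactory: Converter.Factory): Retrofit =
    Retrofit.Builder()
        .baseUrl("https://canvas.jmu.edu/") // assumed base URL for illustration
        .addConverterFactory(converterFactory)
        .build()
```

Note that the Deferred return type in the request function is not handled by Retrofit itself, so a coroutine call adapter would presumably also need to be registered with addCallAdapterFactory.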

Todo:

  • Move the Retrofit Converter to different module
  • Combine parse models with the DSL builder to simplify creation and operation, and to remove the kotlin-reflect dependency that the Retrofit Converter currently requires.

License

Licensed under Apache License, Version 2.0

Credits

This library is partially based on Mikael Gueck's KSoup DSL implementation.

Versions

Version
0.2.1
0.2.0
0.1.0
0.0.2