Optimize Regex to extract content between two tags (or How to select content between two tags with Jsoup selector API?)

You can achieve the same result with Jsoup's CSS selector syntax.

SOLUTION

h2:has(span#JDK_contents) ~
*:not(h2:has(span#Ambiguity_between_a_JDK_and_an_SDK) ~ *):not(h2)

DEMO
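
A minimal sketch of how the selector might be run with Jsoup. The class name BetweenHeadings, the helper textsBetween and the inline HTML fragment are assumptions made for illustration; the fragment only stands in for the JDK wiki page and is not taken from it.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class BetweenHeadings {
    // Selector from the SOLUTION section, joined onto one line.
    static final String SELECTOR =
        "h2:has(span#JDK_contents) ~ "
        + "*:not(h2:has(span#Ambiguity_between_a_JDK_and_an_SDK) ~ *):not(h2)";

    // Hypothetical fragment standing in for the JDK wiki page.
    static final String SAMPLE_HTML =
        "<div>"
        + "<p>intro</p>"
        + "<h2><span id=\"JDK_contents\">JDK contents</span></h2>"
        + "<p>first</p>"
        + "<ul><li>second</li></ul>"
        + "<h2><span id=\"Ambiguity_between_a_JDK_and_an_SDK\">Ambiguity</span></h2>"
        + "<p>after</p>"
        + "</div>";

    // Returns the text of every element selected between the two headings.
    static List<String> textsBetween(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element e : doc.select(SELECTOR)) {
            texts.add(e.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        System.out.println(textsBetween(SAMPLE_HTML)); // prints [first, second]
    }
}
```

Only the sibling elements between the two headings are returned. The intro paragraph comes before the first heading and the last paragraph comes after the second one, so neither is selected. The li is not a sibling of the headings, so it is not matched on its own, but its text appears through the enclosing ul.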

DESCRIPTION

For clarity, let h2Start be an h2 element containing at least one span with id JDK_contents, and let h2End be an h2 element containing at least one span with id Ambiguity_between_a_JDK_and_an_SDK.

h2:has(span#JDK_contents)  /* Select an h2Start */
~ *                        /* Select every sibling element that follows this h2Start... */
:not(h2:has(span#Ambiguity_between_a_JDK_and_an_SDK) ~ *)  /* ...but not those that follow an h2End */
:not(h2)                   /* Remove the h2End itself, which the previous line lets through */

NOTE: In the case of the JDK wiki page, the last line is enough. Keep in mind that :not(h2) excludes every h2 in the range, not only the h2End. To exclude the h2End alone, we would replace it with :not(h2:has(span#Ambiguity_between_a_JDK_and_an_SDK)).
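
The difference between the two endings only shows up when another h2 sits between the start and end headings. The sketch below compares them on a hypothetical fragment with such an extra heading; the class name EndMarkerVariants, the constants LOOSE and STRICT, and the HTML are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class EndMarkerVariants {
    static final String START = "h2:has(span#JDK_contents)";
    static final String END = "h2:has(span#Ambiguity_between_a_JDK_and_an_SDK)";

    // Ending used in the SOLUTION: drops every h2 in the range.
    static final String LOOSE = START + " ~ *:not(" + END + " ~ *):not(h2)";
    // Ending suggested in the note: drops only the h2End itself.
    static final String STRICT = START + " ~ *:not(" + END + " ~ *):not(" + END + ")";

    // Hypothetical fragment with an extra h2 between the two markers.
    static final String SAMPLE_HTML =
        "<div>"
        + "<h2><span id=\"JDK_contents\">JDK contents</span></h2>"
        + "<p>first</p>"
        + "<h2><span id=\"Other\">Other</span></h2>"
        + "<ul><li>second</li></ul>"
        + "<h2><span id=\"Ambiguity_between_a_JDK_and_an_SDK\">Ambiguity</span></h2>"
        + "<p>after</p>"
        + "</div>";

    // Returns the tag name of every element matched by the given selector.
    static List<String> tagNames(String html, String selector) {
        List<String> names = new ArrayList<>();
        for (Element e : Jsoup.parse(html).select(selector)) {
            names.add(e.tagName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(tagNames(SAMPLE_HTML, LOOSE));  // prints [p, ul]
        System.out.println(tagNames(SAMPLE_HTML, STRICT)); // prints [p, h2, ul]
    }
}
```

With :not(h2) the intermediate heading is lost along with the h2End. With the stricter ending it is kept, and only the h2End is excluded.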




