Qt-interest Archive, May 2008
QWebkit - get web links
Message 1 in thread
Hi,
Is it possible to get all the links of an HTML document with QWebKit?
Thanks,
Paulo
--
[ signature omitted ]
Message 2 in thread
"Paulo Moura Guedes" <moura@xxxxxxxxxxxxx> wrote in message
news:200805061724.10118.moura@xxxxxxxxxxxxxxxx
> Hi,
>
> Is it possible to get all the links of an HTML document with QWebKit?
Sure it is. There are several options. QWebKit does not offer DOM access
(yet), but you can just parse the HTML using regexps or, if you have XHTML,
using one of the XML classes. Alternatively, you can insert some JavaScript
into the page to get all the links.
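To illustrate the regexp route, here is a rough sketch in standard C++. It uses std::regex rather than a Qt class purely so that it builds without Qt; the function name extractLinks and the pattern itself are illustrative assumptions. A regular expression is only a heuristic for HTML: this one misses unquoted href values and will also pick up links that sit inside comments or scripts.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collect the href values of <a> tags using a regular expression.
// Heuristic only: handles href="..." and href='...', case-insensitively.
std::vector<std::string> extractLinks(const std::string& html) {
    // Group 1 captures the quote character, group 2 the URL, and \1
    // requires the same quote character to close the attribute value.
    static const std::regex anchor(
        R"re(<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1)re",
        std::regex::icase);

    std::vector<std::string> links;
    for (std::sregex_iterator it(html.begin(), html.end(), anchor), end;
         it != end; ++it) {
        links.push_back((*it)[2].str());
    }
    return links;
}
```

Calling extractLinks() on the page source returns the href values in document order. Relative links come back exactly as written, so they would still have to be resolved against the page URL.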
André
--
[ signature omitted ]
Message 3 in thread
On Tuesday 06 May 2008 18:58:19 André Somers wrote:
> "Paulo Moura Guedes" <moura@xxxxxxxxxxxxx> wrote in message
> news:200805061724.10118.moura@xxxxxxxxxxxxxxxx
>
> > Hi,
> >
> > Is it possible to get all the links of an HTML document with QWebKit?
>
> Sure it is. There are several options. QWebkit does not offer DOM access
> (yet), but you can just parse the HTML using regexps or, if you have xhtml,
> using one of the XML classes. Alternatively, you can insert some javascript
> into the page to get all the links.
I asked whether it was possible using the QWebKit API; of course I can use
regular expressions and whatnot.
BTW, DOM access would be very interesting. Is it planned for >= 4.5?
Thanks,
Paulo
--
[ signature omitted ]
Message 4 in thread
Hi,
"Paulo Moura Guedes" <moura@xxxxxxxxxxxxx> wrote in message
>> > Is it possible to get all the links of an HTML document with QWebKit?
>>
>> Sure it is. There are several options. QWebkit does not offer DOM access
>> (yet), but you can just parse the HTML using regexps or, if you have
>> xhtml,
>> using one of the XML classes. Alternatively, you can insert some
>> javascript
>> into the page to get all the links.
>
> I asked if it was possible using QWebKit API, of course I can use regular
> expressions and what not.
Technically you asked if you could do it "with QWebKit". Well, you can. I
think the most interesting way is to insert a piece of JavaScript into
the web frame. You can use QWebFrame::evaluateJavaScript to run your own
script in the page you loaded. In JavaScript, iterating over the links is
relatively easy as you *can* access the DOM. You can get the results back in
different ways. You can use the resulting QVariant, but you can also make a
QObject available to JavaScript and use that to get the data into your
application.
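As a rough sketch of the evaluateJavaScript route: the script below walks document.links and joins the URLs with newlines, so the result can be read back as a single string from the returned QVariant. The names kCollectLinksScript and splitLines are made up for illustration, and the Qt call is shown only as a comment so that the snippet compiles without Qt; treat that part as an untested assumption about how the pieces fit together.

```cpp
#include <sstream>
#include <string>
#include <vector>

// JavaScript to run inside the loaded page. document.links holds the
// <a> and <area> elements that have an href; their resolved URLs are
// joined with newlines so the result is one plain string.
const char* const kCollectLinksScript =
    "(function() {"
    "  var urls = [];"
    "  for (var i = 0; i < document.links.length; ++i)"
    "    urls.push(document.links[i].href);"
    "  return urls.join('\\n');"
    "})()";

// Split the newline-separated script result into individual URLs,
// skipping empty lines.
std::vector<std::string> splitLines(const std::string& joined) {
    std::vector<std::string> urls;
    std::istringstream in(joined);
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) urls.push_back(line);
    }
    return urls;
}

// With QtWebKit available, the two pieces would be combined roughly as
// follows (assumed usage, once the page has finished loading):
//   QString joined =
//       page.mainFrame()->evaluateJavaScript(kCollectLinksScript).toString();
//   std::vector<std::string> urls = splitLines(joined.toStdString());
```

Returning one joined string sidesteps the question of how a JavaScript array is converted into a QVariant, at the cost of a small parsing step on the C++ side.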
> BTW, DOM access would be very interesting, is it planned for >= 4.5?
AFAIK, things like that are planned for a future version, yes.
André
--
[ signature omitted ]
Message 5 in thread
On Wednesday 07 May 2008 06:21:13 André Somers wrote:
> Hi,
>
> "Paulo Moura Guedes" <moura@xxxxxxxxxxxxx> wrote in message
>
> >> > Is it possible to get all the links of an HTML document with QWebKit?
> >>
> >> Sure it is. There are several options. QWebkit does not offer DOM access
> >> (yet), but you can just parse the HTML using regexps or, if you have
> >> xhtml,
> >> using one of the XML classes. Alternatively, you can insert some
> >> javascript
> >> into the page to get all the links.
> >
> > I asked if it was possible using QWebKit API, of course I can use regular
> > expressions and what not.
>
> Technically you asked if you could do it "with QWebKit". Well, you can. I
> think the most interesting way is to use a javascript piece you insert into
> the web frame. You can use QWebFrame::evaluateJavaScript to start your own
> script in the page you loaded. In javascript, iterating over the links is
> relatively easy as you *can* access the DOM. You can get the results back
> in different ways. You can use the resulting QVariant, but you can also
> make a QObject available to java script and use that to get the data into
> your application.
I misunderstood you at first; I see now that you mean using JavaScript
programmatically. This seems like a very interesting approach. I will try it
for sure. ;)
Thanks
Paulo
--
[ signature omitted ]
Message 6 in thread
On Wednesday 07 May 2008 06:21:13 André Somers wrote:
> I
> think the most interesting way is to use a javascript piece you insert into
> the web frame. You can use QWebFrame::evaluateJavaScript to start your own
> script in the page you loaded. In javascript, iterating over the links is
> relatively easy as you *can* access the DOM. You can get the results back
> in different ways. You can use the resulting QVariant, but you can also
> make a QObject available to java script and use that to get the data into
> your application.
I was trying code like "window.open('link.htm');" hoping some signal would be
emitted with the associated URL, like urlChanged for example, but without
success. :( Isn't this possible?
Paulo
--
[ signature omitted ]
Message 7 in thread
Hi,
> I was trying code like "window.open('link.htm');" hoping some signal was
> emitted with the associated url, like urlChanged for example, but without
> success. :( Isn't this possible?
http://doc.trolltech.com/4.4/qwebview.html#urlChanged or
http://doc.trolltech.com/4.4/qwebframe.html#urlChanged doesn't work for you?
If that is the case, I'd say you've found a bug.
André
--
[ signature omitted ]
Message 8 in thread
On Monday 12 May 2008 11:25:28 André Somers wrote:
> Hi,
>
> > I was trying code like "window.open('link.htm');" hoping some signal was
> > emitted with the associated url, like urlChanged for example, but without
> > success. :( Isn't this possible?
>
> http://doc.trolltech.com/4.4/qwebview.html#urlChanged or
> http://doc.trolltech.com/4.4/qwebframe.html#urlChanged doesn't work for
> you? If that is the case, I'd say you've found a bug.
I tried the latter:
connect(page.mainFrame(), SIGNAL(urlChanged(const QUrl&)), this,
SLOT(slotUrlChanged(const QUrl&)));
If someone confirms that this is a bug, I can report it.
Thanks,
Paulo
--
[ signature omitted ]
Message 9 in thread
"Paulo Moura Guedes" <moura@xxxxxxxxxxxxx> wrote in message
news:200805121559.44992.moura@xxxxxxxxxxxxxxxx
> On Monday 12 May 2008 11:25:28 André Somers wrote:
>> Hi,
>>
>> > I was trying code like "window.open('link.htm');" hoping some signal
>> > was
>> > emitted with the associated url, like urlChanged for example, but
>> > without
>> > success. :( Isn't this possible?
>>
>> http://doc.trolltech.com/4.4/qwebview.html#urlChanged or
>> http://doc.trolltech.com/4.4/qwebframe.html#urlChanged doesn't work for
>> you? If that is the case, I'd say you've found a bug.
>
> I tried the last
>
> connect(page.mainFrame(), SIGNAL(urlChanged(const QUrl&)), this,
> SLOT(slotUrlChanged(const QUrl&)));
>
> If someone confirm this as a bug I can report it.
Just guessing here: could it be that, as the page changes, the frame object
is actually destroyed and replaced by a new one? In that case, the signal
would be disconnected (as the frame object is destroyed), and the slot would
never be called. However, this is just conjecture. You can check it easily
by comparing page.mainFrame() before and after your URL change: if the
two pointers are the same, it is the same object, my guess is wrong, and you
do have an issue. The main thing is to create a minimal, compilable example
that shows your problem before you report it.
André
--
[ signature omitted ]
Message 10 in thread
On Monday 12 May 2008 16:24:00 André Somers wrote:
> "Paulo Moura Guedes" <moura@xxxxxxxxxxxxx> wrote in message
> news:200805121559.44992.moura@xxxxxxxxxxxxxxxx
>
> > On Monday 12 May 2008 11:25:28 André Somers wrote:
> >> Hi,
> >>
> >> > I was trying code like "window.open('link.htm');" hoping some signal
> >> > was
> >> > emitted with the associated url, like urlChanged for example, but
> >> > without
> >> > success. :( Isn't this possible?
> >>
> >> http://doc.trolltech.com/4.4/qwebview.html#urlChanged or
> >> http://doc.trolltech.com/4.4/qwebframe.html#urlChanged doesn't work for
> >> you? If that is the case, I'd say you've found a bug.
> >
> > I tried the last
> >
> > connect(page.mainFrame(), SIGNAL(urlChanged(const QUrl&)), this,
> > SLOT(slotUrlChanged(const QUrl&)));
> >
> > If someone confirm this as a bug I can report it.
>
> Just guessing here: could it be that, as the page changes, the Frame object
> is actually destroyed and replaced by a new one? In that case, the signal
> would be disconnected (as the frame object is destroyed), and the slot
> would never be called. However: this is just conjecture. You can check this
> easily by just comparing page.mainFrame() before and after your URL change:
> if the two are the same, I'd say it is the same object. In that case: you
> have your issue.
Yes, the mainFrame() object doesn't change :/
> Main thing is to create a minimal, compilable example that
> shows your problem before you report it.
No problem, that's just what I did to try this out :)
Thanks,
Paulo
--
[ signature omitted ]
Message 11 in thread
On Monday 12 May 2008 16:24:00 André Somers wrote:
...
> Main thing is to create a minimal, compilable example that
> shows your problem before you report it.
There is yet another strange behavior:
page.mainFrame()->load(url1);
page.mainFrame()->url() == url1
page.mainFrame()->load(url2);
page.mainFrame()->url() == url1
When I load the second URL (url2), the frame still returns the first URL
(url1). I'm calling the url() function after receiving the loadFinished
signal. This seems too obvious to have gone unnoticed...
Paulo
--
[ signature omitted ]