Multiple issues
October 11, 2019 at 6:02 pm #692
Karine Robert
Participant
Hi,
I tried scraping multiple websites, on my localhost on 2 different computers and also on a server. I don’t know if I am doing something wrong, but I can’t make it work, and I am getting several different kinds of issues.
1 – After fetching the website (in this case it was https://lepointdevente.com, the “Événements à venir” section), I set the item’s path (a post title) and I set the next page button link. It says I have 53 items but it should be 40.
Then I click on “Jump to content” and I get this message: “Please select serial item link or enter content link first! Go to Feed section, select pick tool for serial item on sidebar, select your feed post’s link.”. I already did this step, so I don’t know what else to do. I tried with the save and scrape buttons and also with “trigger once”, then went back to the content tab, but it still doesn’t work. Sometimes I retry and I get “Please enter valid URL.” in the post content tab.
2 – I don’t know why, but sometimes I do get the content to appear in the content tab, and then when I select the title, I get the text from another element (right now I get the “Main menu” text).
3 – I don’t understand how to get all articles in the page when there is no link to the individual article, like on this website: https://suoniperilpopolo.org/fr/events or on this one http://leministere.ca/evenements-4/
I have tried on fresh WordPress installations, with only the scraping plugin installed and with the default WP theme. So it’s not a conflict with another extension or the theme. In the settings, it says that cURL is enabled. The PHP version installed is 7.2.
Are there special server requirements to make your plugin work correctly?
Thanks!
October 13, 2019 at 3:16 pm #693
Suman M.
Keymaster
Hi, thanks for contacting us. Please find my comments below.
1) It’s because other blocks/sections also use the same class name. In such a case you’ll need to manually enter/update the XPath. Please enter the following XPath in the “item’s path” field, and then you can click on the “jump to content” button.
//div[contains(@id,"events-list")]//a[contains(@class,"feature-link")]
2) It might be the same issue: the same class name being used in multiple places. Can you please let us know where and when exactly you get this issue so that we can help you with it?
3) I’m not sure I understood this one. I checked https://suoniperilpopolo.org/fr/events and could scrape from it: https://www.screencast.com/t/O3vawB0iYX3s
The plugin requires a PHP memory limit of at least 512MB and a max execution time of at least 180 seconds.
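To illustrate point 1, here is a rough sketch in Python of why a class-only selector can count too many items and how scoping to a container narrows it. This is not the plugin's code. The markup, ids and hrefs below are made up for the example, and since Python's standard ElementTree does not support the XPath contains() function, the same filtering is done by hand.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup (an assumption, not the real page):
# two sections reuse the "feature-link" class.
PAGE = """
<html><body>
  <div id="featured-slider">
    <a class="feature-link" href="/promo/1">Promo 1</a>
  </div>
  <div id="upcoming-events-list">
    <a class="feature-link big" href="/event/1">Event 1</a>
    <a class="feature-link" href="/event/2">Event 2</a>
    <a class="other" href="/about">About</a>
  </div>
</body></html>
"""

def links_by_class(root, cls):
    """Every <a> under root whose class attribute contains cls."""
    return [a for a in root.iter("a") if cls in a.get("class", "")]

def links_scoped(root, id_part, cls):
    """Same match, limited to <div> containers whose id contains id_part.

    Mirrors //div[contains(@id, id_part)]//a[contains(@class, cls)].
    Unlike real XPath, nested matching divs would yield duplicates;
    the sample markup has none.
    """
    found = []
    for div in root.iter("div"):
        if id_part in div.get("id", ""):
            found.extend(links_by_class(div, cls))
    return found

root = ET.fromstring(PAGE)
loose = [a.get("href") for a in links_by_class(root, "feature-link")]
scoped = [a.get("href") for a in links_scoped(root, "events-list", "feature-link")]

print(len(loose), loose)    # class-only match picks up the slider link too
print(len(scoped), scoped)  # scoped match keeps only the event links
```

In this made-up page the class-only match returns three links and the scoped one returns two, which is the same kind of over-count as seeing 53 items where 40 were expected.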