This Python program uses Selenium to find all valid links (internal links only) on a page and save them to a CSV file. I used DVWA as the target; the program has to log in first so that the index (home) page is displayed.
This program is useful when a project needs to scan a given domain name for vulnerabilities. If there are thousands of links, automating the collection is much more practical than doing it manually.
The source code has comments describing the purpose of each line to make it readable for beginners in Python programming. Please note that this is not a tutorial; it is meant to share my Python programming journey.
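To make the idea concrete without opening a browser, here is a minimal sketch of the "keep internal links, drop duplicates, save to CSV" part on its own. The list of `href` values, the helper names `internal_links` and `save_links`, and the file name `url_sketch.csv` are all made up for illustration. It also uses `startswith` and the standard `csv` module rather than the `find` check and pandas used in the real program.

```python
import csv

# Hypothetical href values, standing in for what the browser would return.
hrefs = [
    "http://localhost/dvwa/index.php",
    "http://localhost/dvwa/setup.php",
    "http://localhost/dvwa/index.php",  # duplicate
    "https://www.example.com/",         # external link
    None,                               # <a> tag without an href
]

BASE_URL = "http://localhost/dvwa"


def internal_links(hrefs, base_url):
    """Return the unique hrefs that start with base_url, in first-seen order."""
    kept = [h for h in hrefs if h and h.startswith(base_url)]
    # dict.fromkeys keeps the first occurrence of each URL and preserves order
    return list(dict.fromkeys(kept))


def save_links(links, path):
    """Write one URL per row under a 'url' header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url"])
        writer.writerows([link] for link in links)


links = internal_links(hrefs, BASE_URL)
save_links(links, "url_sketch.csv")
print(links)
```

Keeping the filtering in a small function like this makes it easy to try out on a plain list before pointing Selenium at a real site.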
Sample output: [screenshot not reproduced]
Here is the source code:
```python
# Import webdriver from Selenium Wire instead of Selenium
from seleniumwire import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import pandas as pd

# Open the login page
driver = webdriver.Chrome()  # make sure chromedriver.exe is in the same folder as this Python program
driver.get("http://localhost/dvwa/login.php")

# find the username field and type the username into it
driver.find_element(by=By.NAME, value="username").send_keys('admin')
# find the password field and type the password as well
driver.find_element(by=By.NAME, value="password").send_keys('test1')
# click the login button
driver.find_element(by=By.NAME, value="Login").click()

# wait (up to 10 seconds) until the page shows at least one link, then collect all links
elems = WebDriverWait(driver, 10).until(
    EC.presence_of_all_elements_located((By.TAG_NAME, 'a'))
)

data = []
for elem in elems:
    # get the URL from the link; this is None when the <a> tag has no href
    x = elem.get_attribute("href")
    # keep internal links only
    if x and x.find('http://localhost/dvwa') != -1:
        data.append(x)

# close the browser
driver.close()
# quit the driver, which also closes the debug window
driver.quit()

data = list(dict.fromkeys(data))  # remove duplicates from the list
df = pd.DataFrame(data)
pd.set_option('display.max_colwidth', None)  # make sure pandas prints the whole column
pd.set_option('display.max_rows', None)  # make sure all rows are printed

if not df.empty:  # avoid an IndexError when no link was collected
    print('lastrow : ' + df.iloc[-1, 0])  # print the last row
print('')
print(df)  # display the collected URLs
df.to_csv('url.csv')
```
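One thing I noticed about the `x.find('http://localhost/dvwa') != -1` check: it matches the text anywhere in the URL, so an external link that merely mentions the DVWA address in its query string would also be kept. Below is a rough sketch of a stricter check using `urllib.parse` from the standard library. The function name `is_internal` is my own and not part of Selenium; it is only one possible way to define "internal" (same host, and a path under the base path).

```python
from urllib.parse import urlparse


def is_internal(href, base_url):
    """True if href is http(s), on the same host as base_url, and under its path."""
    if not href:  # covers None and empty strings
        return False
    link, base = urlparse(href), urlparse(base_url)
    if link.scheme not in ("http", "https"):  # skip mailto:, javascript:, etc.
        return False
    base_path = base.path.rstrip("/")
    under_path = link.path == base_path or link.path.startswith(base_path + "/")
    return link.netloc == base.netloc and under_path


BASE_URL = "http://localhost/dvwa"
print(is_internal("http://localhost/dvwa/about.php", BASE_URL))                 # True
print(is_internal("http://example.com/?next=http://localhost/dvwa", BASE_URL))  # False
print(is_internal("http://localhost/dvwa-old/index.php", BASE_URL))             # False
print(is_internal(None, BASE_URL))                                              # False
```

In the loop above, the condition could then become `if is_internal(x, 'http://localhost/dvwa'):`, assuming this stricter definition is what the scan needs.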