  • CS 101 - An Introduction to Programming in Python

Extracting a Link

Using the same method we applied to extract a hextet, we can extract the URL from an HTML anchor tag. The steps are:

  • use find() to locate the index of the first anchor tag on the page, passing the string <a href= as the argument

  • by the same method, find the index of the first quotation mark " at or after that position

  • then find the index of the second (closing) quotation mark, starting the search one position past the first

  • use string slicing to extract the URL from between those two quotation marks.
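The steps above lean on two behaviours of find(): it returns the index of the first match (or -1 when there is no match), and it accepts an optional second argument saying where to start searching. Here is a minimal sketch on a made-up one-line tag. The variable tag and the URL inside it are illustrative assumptions, not part of the lesson's data.

```python
# illustrative data: a single anchor tag (hypothetical, not the lesson's page)
tag = '<a href="https://example.com">Example</a>'

first = tag.find('"')               # index of the first quotation mark
second = tag.find('"', first + 1)   # search again, starting just past the first
missing = tag.find("<img")          # find() returns -1 when nothing matches

print(first, second, missing)       # 8 28 -1
print(tag[first + 1:second])        # https://example.com
```

Starting the second search at first + 1 matters: without it, find() would simply report the first quotation mark again.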

Try it on this snippet of a web page:

# our data
page = """
    <h1>Lorem ipsum dolor sit amet.</h1>
    <ul>
      <li>
        <a href="https://brave.com">Search</a>
      </li>
      <li><a href="https://docs.python.org/3">Python docs</a></li>
    </ul>
"""

# find the index of the first anchor tag
start_link = page.find("<a href=")
# find the first quotation mark, searching from the start of the tag
start_quote = page.find('"', start_link)
# find the second quotation mark, which closes the url;
# we begin the search one position past start_quote
end_quote = page.find('"', start_quote + 1)
# now we use string slicing to extract everything between the two quotes
url = page[start_quote + 1:end_quote]
print(url)  # https://brave.com
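The same four steps can be repeated to reach the next link. One possible approach, sketched here under the assumption that we simply resume each search just past the previous closing quote, is to give find() a starting position for the anchor-tag search as well. The names first_url, second_url, and no_more are hypothetical, and page is a condensed one-line copy of the lesson's data.

```python
# condensed copy of the lesson's data (same two links, on one line)
page = ('<li><a href="https://brave.com">Search</a></li>'
        '<li><a href="https://docs.python.org/3">Python docs</a></li>')

# first link: the same steps as in the lesson
start_link = page.find("<a href=")
start_quote = page.find('"', start_link)
end_quote = page.find('"', start_quote + 1)
first_url = page[start_quote + 1:end_quote]

# second link: resume the search just past the first url's closing quote
start_link = page.find("<a href=", end_quote + 1)
start_quote = page.find('"', start_link)
end_quote = page.find('"', start_quote + 1)
second_url = page[start_quote + 1:end_quote]

# there is no third link, so find() reports -1
no_more = page.find("<a href=", end_quote + 1)

print(first_url)   # https://brave.com
print(second_url)  # https://docs.python.org/3
print(no_more)     # -1
```

Checking for -1 before slicing is worth the extra line in real use, since slicing with indexes derived from -1 would quietly return the wrong text rather than raise an error.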


By Barley-Benincasa Lab & Studios ✏️ in cooperation with The Intern
© Copyright 2022-2023.