台州网页设计培训_建筑工程机械人才培训网官网_百度人工客服在哪里找_新网站百度多久收录

2025/5/25 9:51:39  Source: https://blog.csdn.net/lango_LG/article/details/145784874

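This series collects lecture notes from CS61A, the well-known introductory Python course at UC Berkeley. The notes are in English, with a vocabulary list at the end.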

Contents

01 Natural Language Syntax

02 Representing Syntax

03 Reading Data

04 Tree Representation

Ⅰ A Tree Represented as a List of Tokens

Ⅱ Finding Branches

05 Manipulating Language

Appendix: Vocabulary


01 Natural Language Syntax

Programming languages and natural languages both have compositional syntax.

Utterances from the Suppes subject in the "Child Language Data Exchange System (CHILDES)" project.

02 Representing Syntax

The tree data abstraction can represent the structure of a sentence.

# Tree data abstraction

def tree(label, branches=[]):
    for branch in branches:
        assert is_tree(branch), 'branches must be trees'
    return [label] + list(branches)  # why list

def label(tree):
    return tree[0]

def branches(tree):
    return tree[1:]

def is_tree(tree):
    if type(tree) != list or len(tree) < 1:
        return False
    for branch in branches(tree):
        if not is_tree(branch):
            return False
    return True

def is_leaf(tree):
    return not branches(tree)

def leaves(tree):
    if is_leaf(tree):
        return [label(tree)]
    else:
        return sum([leaves(b) for b in branches(tree)], [])

# Syntax

example = tree('ROOT',
               [tree('FLAG',
                     [tree('NP',
                           [tree('DT', [tree('a')]),
                            tree('JJ', [tree('little')]),
                            tree('NN', [tree('bug')])]),
                      tree('.', [tree('.')])])])

from string import punctuation
contractions = ["n't", "'s", "'re", "'ve"]

def words(t):
    """Return the words of a tree as a string.

    >>> words(example)
    'a little bug.'
    """
    s = ''
    for w in leaves(t):
        # Punctuation (except '$') and contractions attach to the previous word
        no_space = (w in punctuation and w != '$') or w in contractions
        if not s or no_space:
            s = s + w
        else:
            s = s + ' ' + w
    return s

def replace(t, s, w):
    """Return a tree like T with all nodes labeled S replaced by word W.

    >>> words(replace(example, 'JJ', 'huge'))
    'a huge bug.'
    """
    if label(t) == s:
        return tree(s, [tree(w)])
    else:
        return tree(label(t), [replace(b, s, w) for b in branches(t)])
>>> example
['ROOT', ['FLAG', ['NP', ['DT', ['a']], ['JJ', ['little']], ['NN', ['bug']]], ['.', ['.']]]]
>>> leaves(example)
['a', 'little', 'bug', '.']
>>> 'a little bug.'
'a little bug.'
>>> words(example)
'a little bug.'

>>> punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> ['they', "'re", 'coming', 'over']
['they', "'re", 'coming', 'over']
>>> "they're coming over"
"they're coming over"

>>> replace(example, 'JJ', 'huge')
['ROOT', ['FLAG', ['NP', ['DT', ['a']], ['JJ', ['huge']], ['NN', ['bug']]], ['.', ['.']]]]
>>> words(replace(example, 'JJ', 'huge'))
'a huge bug.'
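The `# why list` note on the `tree` constructor is left open in these notes. One plausible reading, sketched below with a simplified `tree` (the `is_tree` check is omitted here for brevity), is that `list(branches)` lets callers pass any iterable of branches, whereas concatenating a list with a tuple directly would fail.

```python
def tree(label, branches=[]):
    # Simplified constructor: same return expression as in the notes,
    # without the is_tree assertion.
    return [label] + list(branches)

# list() accepts any iterable, so a tuple or a generator both work.
from_tuple = tree('NP', (tree('DT'), tree('NN')))
from_generator = tree('NP', (tree(tag) for tag in ['DT', 'NN']))

# Without list(), list + tuple concatenation raises TypeError.
try:
    ['NP'] + (['DT'], ['NN'])
    concat_failed = False
except TypeError:
    concat_failed = True

print(from_tuple)      # ['NP', ['DT'], ['NN']]
print(from_generator)  # ['NP', ['DT'], ['NN']]
print(concat_failed)   # True
```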

03 Reading Data

Files, Strings, and Lists:

Some files are plain text and can be read into Python as either:

    One string containing the whole contents of the file: open('/some/file.txt').read()

    A list of strings, each containing one line: open('/some/file.txt').readlines()

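To make the difference between the two reading styles concrete, here is a small sketch. It writes a throwaway temporary file first (the file contents are invented for illustration) so that it runs anywhere, then reads it back both ways.

```python
import os
import tempfile

# Create a small temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write('first line\nsecond line\n')
    path = f.name

with open(path) as f:
    whole = f.read()       # one string, newlines included
with open(path) as f:
    lines = f.readlines()  # list of strings, each still ending in '\n'

os.remove(path)

print(repr(whole))  # 'first line\nsecond line\n'
print(lines)        # ['first line\n', 'second line\n']
```

Note that `readlines()` keeps the trailing newline on each line, which is one reason `strip()` is useful when processing files.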
Useful string methods for processing the contents of a file:

strip() returns a string without whitespace (spaces, tabs, etc.) on the ends

>>> ' hello '.strip()
'hello'

split() returns a list of strings that were separated by whitespace

>>> 'hi  there'.split()
['hi', 'there']

replace(a, b) returns a string with all instances of string a replaced by string b

>>> '2+2'.replace('+', ' + ')
'2 + 2'
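These methods combine naturally. As a minimal sketch (the input line is made up for illustration), padding every parenthesis with spaces and then splitting on whitespace turns a parenthesized line into a list of tokens:

```python
line = '  (NP (DT a) (NN bug))\n'

# strip() removes the surrounding whitespace, replace() pads each
# parenthesis with spaces, and split() breaks the result into tokens.
spaced = line.strip().replace('(', ' ( ').replace(')', ' ) ')
tokens = spaced.split()

print(tokens)
# ['(', 'NP', '(', 'DT', 'a', ')', '(', 'NN', 'bug', ')', ')']
```

Because `split()` with no arguments treats any run of whitespace as a single separator, the doubled spaces introduced by `replace` do no harm.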
# Reading trees

examples = """
(ROOT (SQ (VP (COP is)
    (NP (NN that))
    (NP (NP (DT a) (JJ big) (NN bug))
      (CC or)
      (NP (DT a) (JJ big) (NN bug))))
    (. ?)))

(ROOT (FLAG (NP (DT a) (JJ little) (NN bug)) (. .)))

""".split('\n')

def read_trees(lines):
    """Return trees as lists of tokens from a list of lines.

    >>> for s in read_trees(examples):
    ...     print(' '.join(s[:20]), '...')
    ( ROOT ( SQ ( VP ( COP is ) ( NP ( NN that ) ) ( NP ( ...
    ( ROOT ( FLAG ( NP ( DT a ) ( JJ little ) ( NN bug ) ) ( ...
    """
    trees = []  # actually a list of lists
    tokens = []
    for line in lines:
        if line.strip():
            tokens.extend(line.replace('(', ' ( ').replace(')', ' ) ').split())
            # A tree is complete once its parentheses balance
            if tokens.count('(') == tokens.count(')'):
                trees.append(tokens)
                tokens = []
    return trees

def all_trees(path='CHILDESTreebank-curr/suppes.parsed'):
    return read_trees(open(path).readlines())
# Like the words() function above, join turns a list into a string
>>> s = ['a', 'little', 'bug']
>>> ' '.join(s)
'a little bug'
>>> '+'.join(s)
'a+little+bug'

>>> len(examples)
11
>>> examples[1]
'(ROOT (SQ (VP (COP is)'
>>> s = examples[1].replace('(', ' ( ')
>>> s
' ( ROOT  ( SQ  ( VP  ( COP is)'
>>> s.split()
['(', 'ROOT', '(', 'SQ', '(', 'VP', '(', 'COP', 'is)']

>>> ts = read_trees(examples)
>>> len(ts)
2
>>> ts[0].count('(')
17
>>> ts[0].count(')')
17

>>> data = all_trees()
>>> len(data)
35906
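The grouping step relies on one idea: keep accumulating tokens across lines until the number of `(` tokens equals the number of `)` tokens. The sketch below traces that running balance on a small made-up input (not taken from the CHILDES data) so the moment each tree completes is visible.

```python
lines = """
(ROOT (FLAG (NP (DT a)
    (NN bug))
  (. .)))
(ROOT (FLAG (NP (NN bugs)) (. .)))
""".split('\n')

tokens, trees, trace = [], [], []
for line in lines:
    if line.strip():  # skip blank lines
        tokens.extend(line.replace('(', ' ( ').replace(')', ' ) ').split())
        # Unclosed parentheses accumulated so far for the current tree
        balance = tokens.count('(') - tokens.count(')')
        trace.append(balance)
        if balance == 0:  # everything opened has been closed
            trees.append(tokens)
            tokens = []

print(trace)       # [3, 2, 0, 0]
print(len(trees))  # 2
```

The first tree spans three lines and is only emitted when the balance returns to zero; the second fits on one line and completes immediately.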

04 Tree Representation

Ⅰ A Tree Represented as a List of Tokens

# Tree plus

def tree(label, branches=[]):
    if not branches:
        return [label]
    else:
        # Flatten the nested [] structure into '(' and ')' tokens stored as elements of one list
        return ['(', label] + sum(branches, []) + [')']

def label(tree):
    if len(tree) == 1:
        return tree[0]
    else:
        assert tree[0] == '(', tree
        return tree[1]

# See Part Ⅱ for an illustration
def branches(tree):
    if len(tree) == 1:
        return []
    assert tree[0] == '('  # checkpoint 1
    opened = 1  # number of '(' currently open
    current_branch = []
    all_branches = []
    for token in tree[2:]:
        current_branch.append(token)
        if token == '(':
            opened += 1
        elif token == ')':
            opened -= 1
        if opened == 1:
            all_branches.append(current_branch)
            current_branch = []
    assert opened == 0  # checkpoint 2
    return all_branches

# This calls the upgraded tree function, so example is a flat list containing '(' and ')' tokens
example = tree('FLAG',
               [tree('NP',
                     [tree('DT', [tree('a')]),
                      tree('JJ', [tree('little')]),
                      tree('NN', [tree('bug')])]),
                tree('.', [tree('.')])])
Ⅱ Finding Branches
['(', 'NP', '(', 'DT', 'a', ')', '(', 'JJ', 'little', ')', '(', 'NN', 'bug', ')', ')']
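One way to see how branches are found in this token list is to record the value of the open-parenthesis counter after every token. The sketch below repeats the loop from `branches` on the list above and adds a `depths` list, which is an illustrative addition and not part of the lecture code.

```python
tokens = ['(', 'NP', '(', 'DT', 'a', ')', '(', 'JJ', 'little', ')',
          '(', 'NN', 'bug', ')', ')']

opened = 1       # the outermost '(' is already open
depths = []      # value of `opened` after each token (for illustration)
current, found = [], []
for token in tokens[2:]:  # skip the leading '(' and the label 'NP'
    current.append(token)
    if token == '(':
        opened += 1
    elif token == ')':
        opened -= 1
    depths.append(opened)
    if opened == 1:  # back at the top level: one branch is complete
        found.append(current)
        current = []

print(depths)  # [2, 2, 2, 1, 2, 2, 2, 1, 2, 2, 2, 1, 0]
print(found)
# [['(', 'DT', 'a', ')'], ['(', 'JJ', 'little', ')'], ['(', 'NN', 'bug', ')']]
```

Each time the counter drops back to 1 a branch ends, and the final 0 corresponds to the closing parenthesis of the `NP` node itself.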

~/lec $ python3 -i ex.py

>>> example
['(', 'FLAG', '(', 'NP', '(', 'DT', 'a', ')', '(', 'JJ', 'little', ')', '(', 'NN', 'bug', ')', ')', '(', '.', '.', ')', ')']
>>> leaves(example)
['a', 'little', 'bug', '.']
>>> words(example)
'a little bug.'

>>> replace(example, 'JJ', 'huge')
['(', 'FLAG', '(', 'NP', '(', 'DT', 'a', ')', '(', 'JJ', 'huge', ')', '(', 'NN', 'bug', ')', ')', '(', '.', '.', ')', ')']
>>> words(replace(example, 'JJ', 'huge'))
'a huge bug.'
>>> ts = all_trees()
>>> ts[123]
['(', 'ROOT', '(', 'FLAG', '(', 'NP', '(', 'DT', 'a', ')', '(', 'NN', 'rabbit', ')', '(', '.', '.', ')', ')', ')', ')']
>>> label(ts[123])
'ROOT'
>>> words(ts[123])
'a rabbit.'
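The flat token list and the nested-list representation from section 02 carry the same information. As a rough sketch of that correspondence, the hypothetical helper `to_nested` below (not part of the lecture code) converts a token list back into nested lists with a small recursive parser.

```python
def to_nested(tokens):
    """Convert a token-list tree into nested lists such as ['NP', ['DT', ['a']]]."""
    def parse(i):
        # Return (subtree, index of the next unread token).
        if tokens[i] != '(':
            return [tokens[i]], i + 1  # a leaf is a single token
        node = [tokens[i + 1]]         # the label follows '('
        i += 2
        while tokens[i] != ')':
            child, i = parse(i)
            node.append(child)
        return node, i + 1             # skip the closing ')'

    result, end = parse(0)
    assert end == len(tokens), 'unbalanced or trailing tokens'
    return result

flat = ['(', 'NP', '(', 'DT', 'a', ')', '(', 'NN', 'bug', ')', ')']
print(to_nested(flat))  # ['NP', ['DT', ['a']], ['NN', ['bug']]]
```

Functions such as `leaves`, `words`, and `replace` never need this conversion, because they only use `tree`, `label`, and `branches` and so work with either representation.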

05 Manipulating Language

def all_trees(path='CHILDESTreebank-curr/suppes.parsed'):
    return read_trees(open(path).readlines())

def replace_all(s, w):
    for t in all_trees():
        r = replace(t, s, w)
        if r != t:  # the replacement actually changed something
            print(words(t))
            print(words(r))
            input()  # wait for the user to press Enter before continuing
>>> replace_all('NNS', 'bears')

>>> replace_all('NP', 'Oski')
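Since `replace_all` needs the treebank file and pauses for keyboard input, it is awkward to try out in isolation. The sketch below is a non-interactive variant under stated assumptions: it uses the nested-list representation from section 02, two tiny hand-made trees instead of the corpus, and a hypothetical generator `changed_pairs` that yields before/after leaves instead of printing.

```python
def tree(label, branches=[]):
    return [label] + list(branches)

def label(t):
    return t[0]

def branches(t):
    return t[1:]

def leaves(t):
    if not branches(t):
        return [label(t)]
    return sum([leaves(b) for b in branches(t)], [])

def replace(t, s, w):
    if label(t) == s:
        return tree(s, [tree(w)])
    return tree(label(t), [replace(b, s, w) for b in branches(t)])

def changed_pairs(trees, s, w):
    """Yield (before, after) leaf lists for trees changed by the replacement."""
    for t in trees:
        r = replace(t, s, w)
        if r != t:  # skip trees with no node labeled s
            yield leaves(t), leaves(r)

sample = [
    tree('NP', [tree('DT', [tree('a')]), tree('NN', [tree('bug')])]),
    tree('VP', [tree('VB', [tree('run')])]),
]
pairs = list(changed_pairs(sample, 'NN', 'bear'))
print(pairs)  # [(['a', 'bug'], ['a', 'bear'])]
```

Replacing a phrase-level label such as `'NP'` swaps out the whole subtree under it, so a call like `replace_all('NP', 'Oski')` substitutes one word for each entire noun phrase.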

Appendix: Vocabulary

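utterance / ˈʌtərəns / a spoken word or statement; punctuation / ˌpʌŋktʃuˈeɪʃ(ə)n / punctuation marks; contraction / kənˈtrækʃn / a shortened form of a word or phrase; plain / pleɪn / simple, unformatted; whitespace: blank characters such as spaces; strip / strɪp / to remove; split / splɪt / to divide; tab: a tab character; instance: an occurrence (as in "all instances of"); parse / pɑːrs / (computing) to analyze syntactic structure; token: (computing) a symbol or unit of text; manipulate / məˈnɪpjuleɪt / to operate on, to handle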
