What Is Parsing?

Parsing is something I have come across a lot in development, but as a junior it's one of those things I assumed I would get the hang of at some point, when it was needed. In my current project I've been told to find and use an HTML parser for a certain function. I have found a couple on the web, but what does an HTML parser actually do? And what does it mean to parse an object?
 
Parsing usually applies to text - the act of reading text and converting it into a more useful in-memory format, "understanding" what it means to some extent. So for example, an XML parser will take the sequence of characters (or bytes) and convert them into elements, attributes etc.
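To make the XML case concrete, here is a minimal sketch using the DOM parser that ships with the JDK (javax.xml.parsers). The class name XmlParseDemo, the helper parseRoot, and the sample document are made up for illustration. An HTML parser plays the same role for HTML, but it usually needs a separate library because real-world HTML is often not well-formed XML.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical demo class: turns XML text into an in-memory element tree.
class XmlParseDemo {

    // Parses the given XML text and returns the root element of the tree.
    // Checked parser exceptions are wrapped so callers get one exception type.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            Document doc = builder.parse(new ByteArrayInputStream(bytes));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Input: a flat sequence of characters.
        Element book = parseRoot("<book id=\"42\"><title>Parsing</title></book>");

        // Output: elements and attributes that can be queried directly.
        System.out.println(book.getTagName());          // book
        System.out.println(book.getAttribute("id"));    // 42
        System.out.println(
                book.getElementsByTagName("title").item(0).getTextContent()); // Parsing
    }
}
```

After parsing, the program no longer searches for angle brackets in a string; it asks the tree for an element or an attribute by name.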
In some cases (particularly compilers) there's a separation between lexical analysis and syntactic analysis, so the real "understanding" part of the parser works on a sequence of tokens (identifiers, operators etc) rather than on the raw characters.
 
Parsing or syntactic analysis is the process of analysing a string of symbols, either in natural language or in computer languages, conforming to the rules of a formal grammar. The term parsing comes from Latin pars (orationis), meaning part (of speech).[1][2]

The term has slightly different meanings in different branches of linguistics and computer science. Traditional sentence parsing is often performed as a method of understanding the exact meaning of a sentence or word, sometimes with the aid of devices such as sentence diagrams. It usually emphasizes the importance of grammatical divisions such as subject and predicate.

Within computational linguistics the term is used to refer to the formal analysis by a computer of a sentence or other string of words into its constituents, resulting in a parse tree showing their syntactic relation to each other, which may also contain semantic and other information.

The term is also used in psycholinguistics when describing language comprehension. In this context, parsing refers to the way that human beings analyze a sentence or phrase (in spoken language or text) "in terms of grammatical constituents, identifying the parts of speech, syntactic relations, etc."[2] This term is especially common when discussing what linguistic cues help speakers to interpret garden-path sentences.

Within computer science, the term is used in the analysis of computer languages, referring to the syntactic analysis of the input code into its component parts in order to facilitate the writing of compilers and interpreters. The term may also be used to describe a split or separation.

Source: https://en.wikipedia.org/wiki/Parsing
 
Parsing is a very important part of many computer science disciplines. For example, compilers must parse source code to be able to translate it into object code. Likewise, any application that processes complex commands must be able to parse the commands. This includes virtually all end-user applications.
Parsing is often divided into lexical analysis and semantic parsing. The lexical analysis concentrates on dividing strings into components called tokens, based on punctuation and other keys. Semantic parsing then attempts to determine the meaning of the string.
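The two stages can be sketched for a tiny arithmetic language that has only non-negative integers, '+' and '*'. This is an illustrative toy, not how any particular compiler does it: the class name TinyExpr and the grammar are assumptions, and the second stage evaluates the expression directly instead of building a tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy: a tokenizer plus a recursive-descent parser.
class TinyExpr {

    // Stage 1 (lexical analysis): split raw characters into tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;                                   // whitespace separates tokens
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < text.length() && Character.isDigit(text.charAt(i))) {
                    i++;
                }
                tokens.add(text.substring(start, i));  // a whole number is one token
            } else if (c == '+' || c == '*') {
                tokens.add(String.valueOf(c));
                i++;
            } else {
                throw new IllegalArgumentException("Unexpected character: " + c);
            }
        }
        return tokens;
    }

    // Stage 2: walk the tokens according to this assumed grammar:
    //   expr := term ('+' term)*
    //   term := NUMBER ('*' NUMBER)*
    private final List<String> tokens;
    private int pos = 0;

    private TinyExpr(List<String> tokens) {
        this.tokens = tokens;
    }

    static int evaluate(String text) {
        TinyExpr parser = new TinyExpr(tokenize(text));
        int value = parser.expr();
        if (parser.pos != parser.tokens.size()) {
            throw new IllegalArgumentException(
                    "Unexpected token: " + parser.tokens.get(parser.pos));
        }
        return value;
    }

    private int expr() {
        int value = term();
        while (pos < tokens.size() && tokens.get(pos).equals("+")) {
            pos++;
            value += term();
        }
        return value;
    }

    private int term() {
        int value = number();
        while (pos < tokens.size() && tokens.get(pos).equals("*")) {
            pos++;
            value *= number();
        }
        return value;
    }

    private int number() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("Expected a number at end of input");
        }
        String token = tokens.get(pos);
        if (!Character.isDigit(token.charAt(0))) {
            throw new IllegalArgumentException("Expected a number, got: " + token);
        }
        pos++;
        return Integer.parseInt(token);
    }

    public static void main(String[] args) {
        System.out.println(tokenize("12 + 3 * 4")); // [12, +, 3, *, 4]
        System.out.println(evaluate("12 + 3 * 4")); // 24
    }
}
```

The second stage never looks at individual characters. It only sees tokens, which is what lets it apply the rule that '*' binds tighter than '+'.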
 
Parsing in Java methods means that the method is taking input from a string and returning some other data type.

Definition of parse

The actual definition of "parse" in Wiktionary is "To split a file or other input into pieces of data that can be easily stored or manipulated." So we are splitting a string into parts then recognizing the parts to convert it into something simpler than a string.

Parsing an integer

An example would be Java's Integer.parseInt() method. It would take an input such as "123", which is a string consisting of the characters '1', '2', and '3'. Then it would convert this value to the integer 123, which is a simple number that can be stored and manipulated as an integer.

I can see why you might be confused by this simple example, though, since the string "123" doesn't have any obvious parts.
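The parts become visible if the conversion is written out by hand: each character is one digit, and the digits are combined from left to right. The sketch below is a simplified stand-in, not the JDK's actual implementation. The name parseDigits is made up, and it handles only non-negative decimal input with no sign and no overflow check.

```java
// Hypothetical demo class: a hand-written, simplified integer parser.
class DigitParser {

    // Converts a string of decimal digits to an int, one character at a time.
    static int parseDigits(String text) {
        if (text.isEmpty()) {
            throw new NumberFormatException("Empty input");
        }
        int value = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < '0' || c > '9') {
                throw new NumberFormatException("Not a digit: " + c);
            }
            // Shift what we have so far one decimal place left, then add the new digit.
            value = value * 10 + (c - '0');
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseDigits("123"));          // 123
        System.out.println(Integer.parseInt("123") + 1); // 124, arithmetic works on the result
    }
}
```

For "123" the running value goes 1, then 12, then 123, so the "parts" are the individual digit characters.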
 
Hi,


In computer technology, a parser is a program, usually part of a compiler, that receives input in the form of sequential source program instructions, interactive online commands, markup tags, or some other defined interface and breaks them up into parts (for example, the nouns (objects), verbs (methods), and their attributes or options) that can then be managed by other programming (for example, other components in a compiler). A parser may also check to see that all input has been provided that is necessary.
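As a rough illustration of breaking an interactive command into a verb, its objects, and its options, here is a small sketch. The command format (a verb, then arguments, with options starting with "--") and the class name CommandLineSketch are assumptions made for this example. The check for a missing object stands in for the idea that a parser may verify that all necessary input has been provided.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: splits a command line into verb, objects and options.
class CommandLineSketch {
    final String verb;
    final List<String> objects = new ArrayList<>();
    final List<String> options = new ArrayList<>();

    private CommandLineSketch(String verb) {
        this.verb = verb;
    }

    static CommandLineSketch parse(String line) {
        // Split on runs of whitespace; an empty line yields a single empty word.
        String[] words = line.trim().split("\\s+");
        if (words[0].isEmpty()) {
            throw new IllegalArgumentException("Missing command verb");
        }
        CommandLineSketch command = new CommandLineSketch(words[0]);
        for (int i = 1; i < words.length; i++) {
            if (words[i].startsWith("--")) {
                command.options.add(words[i]);   // e.g. --force
            } else {
                command.objects.add(words[i]);   // e.g. a file name
            }
        }
        // Assumed rule for this sketch: every command needs at least one object.
        if (command.objects.isEmpty()) {
            throw new IllegalArgumentException("Command needs at least one object");
        }
        return command;
    }

    public static void main(String[] args) {
        CommandLineSketch command = parse("copy report.txt --force");
        System.out.println(command.verb);    // copy
        System.out.println(command.objects); // [report.txt]
        System.out.println(command.options); // [--force]
    }
}
```

Other code can then act on command.verb and command.options without re-reading the original string.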
 
Parsing means breaking a data block into smaller chunks by following a set of rules, so that it can be more easily interpreted, managed, or transmitted by a computer. Spreadsheet programs, for example, parse data to fit it into a cell of a certain size.
 
 