digital literacy for everyone


[lit]


[lit] i wanted to process some text like this:

1. take each line, show each pair of words (like this)

    take each
    each line
    line show
    show each
    each pair
    pair of
    of words

i'm sure this is doable in bash. here's the solution i came up with:

[figplus]

    #### license: creative commons cc0 1.0 (public domain)
    #### http://creativecommons.org/publicdomain/zero/1.0/
    p arrstdin  # p is array of piped-in text
    forin each p  # loop through each line of p and set each to it
        eaches split each ' ' plus ' '  # split each by ' ' into array "eaches", add ' '
        eachlen eaches len minus 1  # get length of eaches minus 1
        for couple 1 eachlen 1  # for loop named couple, from 1 to eachlen step 1
            other couple plus 1  # couple is first word, other is second word
            now eaches mid couple 1 prints ' ' prints  # use mid to get item number "couple" from eaches
            now eaches mid other 1 print  # use mid to get item number "other" from eaches
            next  # end for loop
        next  # end forin loop (nextin command also works)

but i didn't write this program in an editor; i wrote it gradually on the command line, like i would for a bash snippet:

    cat | figplus05.py -c "p arrstdin \n forin each p \n eaches each print \n next"

that will just loop through whatever you paste and print it (pretty useless until we add more). adding the rest gives the full one-liner:

    cat | figplus05.py -c "p arrstdin \n forin each p \n eaches split each ' ' plus ' ' \n eachlen eaches len minus 1 \n for couple 1 eachlen 1 \n other couple plus 1 \n now eaches mid couple 1 prints ' ' prints \n now eaches mid other 1 print \n next \n next"

you can check this yourself:

    echo you can check this yourself | figplus05.py -c "p arrstdin \n forin each p \n eaches split each ' ' plus ' ' \n eachlen eaches len minus 1 \n for couple 1 eachlen 1 \n other couple plus 1 \n now eaches mid couple 1 prints ' ' prints \n now eaches mid other 1 print \n next \n next"

    you can
    can check
    check this
    this yourself
    yourself

but i wanted to know the frequency of pairs of consecutive words (shortest possible phrases), so i added this (using bash):

    cat | figplus05.py -c "p arrstdin \n forin each p \n eaches split each ' ' plus ' ' \n eachlen eaches len minus 1 \n for couple 1 eachlen 1 \n other couple plus 1 \n now eaches mid couple 1 prints ' ' prints \n now eaches mid other 1 print \n next \n next" | sort | uniq -c | sort -n # public domain

    echo you can check this yourself check this out | figplus05.py -c "p arrstdin \n forin each p \n eaches split each ' ' plus ' ' \n eachlen eaches len minus 1 \n for couple 1 eachlen 1 \n other couple plus 1 \n now eaches mid couple 1 prints ' ' prints \n now eaches mid other 1 print \n next \n next" | sort | uniq -c | sort -n

    1 can check
    1 out
    1 this out
    1 this yourself
    1 you can
    1 yourself check
    2 check this

i'm using this to find (longer) repeated phrases in text. it probably took me about 10 or 15 minutes, though i probably should have tried it with three words instead:

    cat | figplus05.py -c "p arrstdin \n forin each p \n eaches split each ' ' plus ' ' plus ' ' \n eachlen eaches len minus 1 \n for couple 1 eachlen 1 \n other couple plus 1 \n now eaches mid couple 1 prints ' ' prints \n now eaches mid other 1 prints ' ' prints \n other couple plus 2 \n now eaches mid other 1 print \n next \n next" | sort | uniq -c | sort -n # public domain
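
for comparison, here is a rough python sketch of the same idea: count every run of n consecutive words on each line, then list the least frequent first, the way sort | uniq -c | sort -n does. this is not part of figplus and not a description of how figplus05.py works inside; the names ngram_counts, n and sample are made up for this example. one difference to expect: it never pads the word list with an extra ' ', so it does not report the last word of a line on its own.

```python
from collections import Counter


def ngram_counts(lines, n=2):
    """Count each run of n consecutive whitespace-separated words, line by line.

    ngram_counts and n are hypothetical names for this sketch, not figplus commands.
    """
    counts = Counter()
    for line in lines:
        words = line.split()
        # slide a window of n words across the line;
        # a line with fewer than n words contributes nothing
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


# sample is an assumed stand-in for piped-in text
sample = ["you can check this yourself check this out"]

# sort by count, then by phrase, so the most repeated phrases come last
for phrase, count in sorted(ngram_counts(sample).items(), key=lambda kv: (kv[1], kv[0])):
    print(count, phrase)
```

calling ngram_counts(sample, n=3) gives the three-word version, and passing sys.stdin instead of sample would let it sit at the end of a pipe like the fig one-liners.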
back to figplus page: [lit]https://codeinfig.neocities.org/figplus/[lit]

home: [lit]https://codeinfig.neocities.org[lit]