More challenging projects every programmer should try (web.eecs.utk.edu)
A long time ago I wanted to code a Neo Geo emulator but gave up before I even started; I didn't have a clue where to begin.
I am amazed at anyone who can code an emulator from scratch.
It mostly boils down to keeping a bunch of registers and a giant switch statement. Each case simply implements one opcode. You have an array of bytes for the memory, and some emulated devices (e.g. trigger a screen update when the framebuffer memory gets changed, or set the instruction pointer to a handler when a key gets pressed).
It gets hard when instructions need lots of decoding, or you have 3d graphics hardware to emulate, or if you have something like a BIOS, or if you want to JIT.
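The "registers plus a giant switch" idea can be sketched in a few lines. This is a minimal illustration only: the four opcodes and their encodings are invented for the example and do not correspond to any real console, and Python's `if`/`elif` chain stands in for the switch statement.

```python
# Hypothetical 4-instruction machine; opcodes are invented for illustration.
LOAD_IMM, ADD, STORE, HALT = 0x01, 0x02, 0x03, 0xFF

def run(program, memory_size=256):
    memory = bytearray(memory_size)          # flat array of bytes
    memory[:len(program)] = bytes(program)   # program is loaded at address 0
    regs = [0, 0]                            # two 8-bit general registers
    pc = 0                                   # instruction pointer

    while True:
        opcode = memory[pc]                  # fetch
        if opcode == LOAD_IMM:               # LOAD_IMM reg, value
            regs[memory[pc + 1]] = memory[pc + 2]
            pc += 3
        elif opcode == ADD:                  # ADD dst, src (wraps at 8 bits)
            dst, src = memory[pc + 1], memory[pc + 2]
            regs[dst] = (regs[dst] + regs[src]) & 0xFF
            pc += 3
        elif opcode == STORE:                # STORE reg, address
            memory[memory[pc + 2]] = regs[memory[pc + 1]]
            pc += 3
        elif opcode == HALT:
            return regs, memory
        else:
            raise ValueError(f"unknown opcode {opcode:#04x} at {pc}")

program = [
    LOAD_IMM, 0, 40,
    LOAD_IMM, 1, 2,
    ADD, 0, 1,
    STORE, 0, 200,
    HALT,
]
regs, memory = run(program)
print(regs[0], memory[200])  # prints "42 42"
```

An emulated device would hook into the `STORE` case, for example by redrawing the screen when the written address falls inside a framebuffer range.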
I'm actually eyeing implementing a console on an FPGA as my holiday project, with something like Chisel.
Although that's not "from scratch" because it would still be using their libraries and plumbing, it could be "from scratch" in the sense that you might take an environment with 0% support for emulating a certain device or system and build it up to having 100% support. So it seems like a nice way to start.
Before attempting to do so I thought it was implemented as a simple seek over the string, maybe a bunch of regex stuff. I guess it can be done that way, at the cost of growing complexity; but the proper solution (with a stack, etc.) is so elegant (making it easy to add functions, operators, parentheses, variables, etc.) that it really makes one appreciate the value of good, thoughtful engineering.
Has anyone tried CGI? If so, how has your experience been so far?
I want to try this. Where can you get access to historical pricing data that includes pricing changes during the day, not just end of day prices?
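For readers who have not seen the stack-based approach, here is a minimal sketch using two stacks (one for values, one for pending operators). It is one common way to do it, not necessarily what the commenter wrote; the function names are made up, and it deliberately omits unary minus, variables, and functions.

```python
import operator
import re

# operator -> (precedence, implementation)
OPS = {
    "+": (1, operator.add),
    "-": (1, operator.sub),
    "*": (2, operator.mul),
    "/": (2, operator.truediv),
}

def tokenize(text):
    # numbers, the four operators, and parentheses; whitespace is skipped
    return re.findall(r"\d+\.?\d*|[-+*/()]", text)

def evaluate(text):
    values, ops = [], []

    def apply_top():
        # pop one operator and its two operands, push the result
        _, fn = OPS[ops.pop()]
        right = values.pop()
        left = values.pop()
        values.append(fn(left, right))

    for tok in tokenize(text):
        if tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops[-1] != "(":
                apply_top()
            ops.pop()                        # discard the "("
        elif tok in OPS:
            # reduce while the stacked operator binds at least as tightly
            while ops and ops[-1] != "(" and OPS[ops[-1]][0] >= OPS[tok][0]:
                apply_top()
            ops.append(tok)
        else:
            values.append(float(tok))
    while ops:
        apply_top()
    return values[0]

print(evaluate("2 + 3 * (4 - 1)"))  # prints 11.0
```

Adding a new binary operator is one more entry in `OPS`, which is the kind of extensibility the comment is praising.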
- You are trading in a market with low liquidity or one that is controlled by a small number of market participants. I'm not an expert but I think this would apply more to markets like penny stocks and less to big markets like forex for major currency pairs
- You are not taking transaction costs into account or not doing so properly
- Your bot makes a low number of trades, making the results close or equivalent to lucky coin flips
- Your bot is simply making trades that cannot be executed, or may be doing simulated trades of something that is not actually tradable. This applies to a large number of research papers that assume you can just buy and trade the S&P 500 itself. You can trade ETFs that are tied to an index but an index is not a tradable instrument in and of itself. Once you realize this, a lot of papers seem very weird
- You are not modelling other aspects of the trading process realistically, such as assuming the bot has infinite funds to trade, allowing it to take unlimited losses and continue trading when in reality you'd be hit with a margin call and your trading would be stopped
- Your code is committing any number of data snooping errors where the bot is asked to trade at time A (say the open of a trading session) but has access to future data (say the closing price of that day, future data that would not actually exist in a live environment)
- Depending on what you believe about how market conditions change over time, your bot may have worked in the past but would not work if used today. I.e., the market may have adapted to whatever edge your bot may have discovered
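The data snooping pitfall in the list above is easy to reproduce. The sketch below uses made-up daily bars (not real market data) and two hypothetical signals: one that peeks at the same day's close when deciding at the open, and one that only looks at bars that had already finished.

```python
# Made-up daily bars as (open, close); not real market data.
bars = [(100, 102), (102, 101), (101, 104), (104, 103), (103, 106)]

def backtest(signal):
    """Buy at the open and sell at the close on days where signal(i) is True."""
    pnl = 0
    for i, (open_, close) in enumerate(bars):
        if signal(i):
            pnl += close - open_
    return pnl

def leaky(i):
    # BUG: decides at the open using that same day's close (future data).
    open_, close = bars[i]
    return close > open_

def honest(i):
    # Only uses the previous bar, which was complete before today's open.
    if i == 0:
        return False
    prev_open, prev_close = bars[i - 1]
    return prev_close > prev_open

print(backtest(leaky), backtest(honest))  # prints "8 -2"
```

The leaky version never loses because it only trades on days it already knows went up; the same "buy after an up day" idea loses money once the lookahead is removed. Transaction costs would make both results worse.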
There are probably lots more pitfalls I don't even know about since I'm not an actual trader.
I'm not discouraging anyone from playing around or trying things, of course. I think it's great fun, which is why I do it.
Here's the good news: if you realize you don't actually have an edge and avoid risking your hard-earned money, you come out ahead of almost all people who ever trade.
* Doing a latency-sensitive trade when you don't have good execution. It's easy to go wild in simulation and think you can flip in and out of positions. But if you're a retail trader (and in this context, by that I mean "not connected directly to the exchanges, at the minimum")
* Not taking into account the impact of your own trading on markets. This is obviously impossible to really simulate. Sometimes you can ignore it (if there's enough liquidity) but I've seen trades that looked great on paper and when trading at small-ish sizes, but then when you try to crank it up and do more volume, prices run away from you.
Obviously, there is money to be made in algo trading. It's big business, and obviously not everyone is doing crazy latency-sensitive stuff; there are quant trades that you could probably do with execution available to retail traders. And honestly I wouldn't be surprised if some retail traders manage it. But I will say that I think for an individual, it's not worth the effort even if you are one of the ones who can consistently make a profit. Just buy S&P 500 ETFs and sit on them and do something else with your time.
I've found that stuff a lot. People forget that on the other end of the price is a thinking human or robo delegate thereof trying to outsmart you. It's really hard to say what will happen until you try it with real money / assets.
I hear these horrid tales of people killing themselves over bad trades and lines of credit and I wonder how much is poorly written code and how much is the chaos and whim of the market (don't worry, I'm not going to try algo trading).
It didn't make money in the real world because the quality of decision that could be made at that speed wasn't good enough.
If you do too much of this you will get angry exchange network people yelling at you as they may consider it spam/ddos. Some exchanges explicitly limit this.
It's also worth considering the possible gain. If you're in the FPGA (0-150ns) or CPU (300ns-3us) space, the math can come out differently.
Failing the checksum was sort of common as well, but exchanges don't like it.
These days most tricks have to do with avoiding as much serialisation time as possible, e.g. by sending ahead part of the TCP payload and filling in with some innocent order if there was no opportunity.
Replace widget with liquidity and factories with traders and you should immediately see why most traders are losing money. They are operating in markets where their liquidity simply isn't needed.
If you want to make money off of trading then you have to find a market with low liquidity where having more traders is actually welcome. But why do hard work if you can just invest your money and still get good gains without working?
- management fees, margins, and available capital can easily be modelled properly
- you can easily set up constraints (like no fractional trading for your S&P example)
- you can set up a proper point-in-time database to avoid snooping (especially if you're using earnings reports or other fundamental data which is often actually released _after_ its published release date...)
- you can set up regime-shifting simulation environments (various market conditions)
- you can avoid over-fitting if you're back-testing (with dozens of techniques, most notably: test once and forget about parameter optimization)
I would say that with paper trading and back-testing the serious problems are that:
- your orders don't show up on the book so no one sees and reacts to your limit orders
- your "filled" orders don't affect the book, so you're not affecting liquidity, so the market doesn't change in response to your trading
- your bot has no access to market micro-structure strategies and conditional orders (and if you want to trade fast or are placing big trades you need them)
These are the problems that make any simulation unrealistic, and they are fundamental. It's shadowboxing, which is not entirely devoid of value, but which is certainly insufficient on its own.
(I've worked as a quant developing strategies for several funds these past 15 years)
My main mistake was using the historical trades instead of the historical offers as a testing dataset.
Out of curiosity, what's the difference?
A couple of years ago I implemented a toy regex engine from scratch (building NFAs then turning them into DFAs). I thought it was an enlightening experience because it showed me that the core principles behind regular languages are fairly simple, although you could spend years optimizing and improving your implementation. How do you deal with Unicode? How do you modify your implementation to know how many characters you can skip if you don't have a match in order to avoid testing every single position in a file?
It demystified the concept of a regex engine for me while at the same time making me realize how impressive the advanced, ultra optimized engines we use and take for granted are.
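The NFA-to-DFA step mentioned above (the subset construction) fits in a short sketch. To keep it small, the NFA below is written by hand for the pattern `(a|b)*abb` rather than compiled from regex syntax; the state numbers, `NFA` table layout, and function names are all made up for the example.

```python
EPS = None  # label used for epsilon transitions

# Hand-built NFA for (a|b)*abb: state -> {symbol: set of next states}
NFA = {
    0: {EPS: {1}},
    1: {"a": {1, 2}, "b": {1}},
    2: {"b": {3}},
    3: {"b": {4}},
}
ACCEPTING = {4}

def eps_closure(states):
    """All NFA states reachable from `states` via epsilon moves only."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in NFA.get(s, {}).get(EPS, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return frozenset(seen)

def to_dfa(start, alphabet):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    start_set = eps_closure({start})
    table, todo = {}, [start_set]
    while todo:
        current = todo.pop()
        if current in table:
            continue
        table[current] = {}
        for ch in alphabet:
            moved = set()
            for s in current:
                moved |= NFA.get(s, {}).get(ch, set())
            nxt = eps_closure(moved)
            table[current][ch] = nxt
            todo.append(nxt)
    return start_set, table

def matches(text, start_set, table):
    state = start_set
    for ch in text:
        if ch not in table[state]:
            return False                     # symbol outside the alphabet
        state = table[state][ch]
    return bool(state & ACCEPTING)

start_set, table = to_dfa(0, "ab")
print(len(table), matches("babb", start_set, table))  # prints "5 True"
```

Once the table is built, matching is one dictionary lookup per input character, which is why the comment separates "simple core principles" from the years of optimization real engines put on top.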
It's poor advice for someone who already has a STEM degree and wants to build something useful and profitable. If you already know how these things work, your time is better spent on the "edge of the circle": http://matt.might.net/articles/phd-school-in-pictures/ which applies to businesses and startups as well.
If you're in the latter group -- you've already got the skills to build real shit. Don't waste your time on homework problems. Find a problem you have and build a solution for it. Don't listen to people who tell you to work on homework problems that have already been solved; it's a complete waste of your time if you already know the fundamentals.
As for stock trading bots -- if you don't have a mathematics degree or equivalent (e.g. having incredible math skills), don't even bother. You won't be profitable, and you will learn nothing useful in the process, because you will approach the problem as a naive CS student would. Smarter people than you have made trading bots and have failed miserably. Without having an extremely strong foundation in mathematics, your trading bot will amount to nothing more than a futile exercise in gluing APIs together.
Which is why lists like this help.
> They are programmers, not creative after all.
Damn, this is a bold comment on HN. Since when does profession determine how creative you are? I've met "programmers" and "geeks" who are more creative than "artists". Your profession has little to do with your innate creativity -- it just determines how you are able to express that creativity.
Unpopular opinion, but lists like this are stupid for people who are trying to build companies. You have to try things and be pissed off at the status quo to find real problems. Nobody is going to find real problems for you, in the same way that no quant school is going to reveal their hedge fund's trading strategy to you. Finding ideas in a list is the last advice I would give to anyone. If it's public, it's probably not a profitable idea.
What is an "intuitive ordinal notation"? Definition: The set of intuitive ordinal notations is the smallest set P of computer programs with the following property. For every computer program p, if, when p is run, all of p's outputs are elements of P, then p is in P.
So "End.", the program which immediately ends with no outputs, is vacuously in P (all of its outputs are in P, because it has no outputs). It notates the ordinal 0. Likewise, "Print(`End.')" is in P, because its sole output, "End.", is in P; it notates the ordinal 1. Likewise, "Print(`Print(`End.')')" is in P, notating the ordinal 2. And so on.
The above can be short-circuited: "Let X=`End.'; While(True){Print(X); X=`Print(\`'+X+`\')'}". This program outputs "End.", "Print(`End.')", "Print(`Print(`End.')')", and so on forever, all of which are in P, so this program itself is in P. It notates omega, the smallest infinite ordinal.
Here's a library of examples in Python, currently going up to a notation for the ordinal omega^omega: https://github.com/semitrivial/IONs
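To make the toy notation above concrete, the following sketch builds the program text for a finite ordinal and reproduces the output stream of the looping program for omega. It only manipulates strings in the comment's informal pseudo-language; the function names are invented here and this is not code from the linked library.

```python
from itertools import islice

def finite_notation(n):
    """Return the toy-language program that notates the finite ordinal n."""
    program = "End."
    for _ in range(n):
        program = "Print(`" + program + "')"
    return program

def omega_outputs():
    """Yield, one at a time, what the looping program for omega prints."""
    x = "End."
    while True:
        yield x
        x = "Print(`" + x + "')"

print(finite_notation(2))               # prints Print(`Print(`End.')')
print(list(islice(omega_outputs(), 3)))
```

Every string the generator yields equals `finite_notation(k)` for some k, which is the informal reason the looping program belongs to P.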
Web scraping to gather data, databases for storing it, ML for analyzing, front and backend web dev to show the daily information and adjust.
And instead of having to deal with trading regulations, contests can be really small and easy to enter. There are daily contests for 5 cents an entry, and you can enter 150 optimized lineups from an uploaded csv for $7.50 a day. You can really learn a ton.
Could you describe your approach? What type of information would you scrape?
By the way, adventofcode.com is currently ongoing. Though the challenges are easy compared to the projects in this list, I highly recommend it. It covers problems you might face in big projects. With these small puzzles it's easy to experiment. It prepares you for bigger things.
https://github.com/lexborisov/Modest
https://github.com/ArthurHub/HTML-Renderer
Along the same lines, some other challenging projects I recommend are to write decoders/renderers for existing formats like MP3, MP4, PDF, etc.
A fun project I did after that was writing an AI frame language to do goal-stack problem solving, specifically with path finding. I connected it to the ray tracer and made movies of spheres having wars. (I used an unlicensed DivX encoder to stitch together thousands of GIFs.)
You can very quickly get something on the screen with it, although getting an intuition for how it works may take a little longer and some reading around the concepts it introduces. But it does let you focus on the maths rather than worrying about also learning how Web/OpenGL works too.
- Count the number of zero crossings
- Find out where they are
- Create any shape of wave by adding together multiple sine waves
- Hard clip the signal
- Stretch a signal and interpolate it with new samples
- Invert and revert a signal
For level 2, you can start processing "live":
- Create a sine synthesizer
- Create a small ring buffer of samples
- Find out how to output that audio (system audio, soundcard)
- Add MIDI support
- Add polyphony support
DSP gets hard once it has to be in real time and the latency has to be minimal. It's great exercise to mess around with it.
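The offline exercises listed above need nothing beyond lists of floats. Here is a rough sketch in plain Python; the sample rate is an arbitrary choice, the function names are made up, and "revert" is assumed to mean playing the samples backwards.

```python
import math

SAMPLE_RATE = 8000  # samples per second; an arbitrary choice for the sketch

def sine(freq, n_samples, amplitude=1.0):
    return [amplitude * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)
            for i in range(n_samples)]

def mix(*signals):
    """Additive synthesis: sum the signals sample by sample."""
    return [sum(samples) for samples in zip(*signals)]

def hard_clip(signal, limit):
    return [max(-limit, min(limit, s)) for s in signal]

def zero_crossings(signal):
    """Indices i where the sign changes between sample i-1 and sample i."""
    return [i for i in range(1, len(signal))
            if (signal[i - 1] < 0) != (signal[i] < 0)]

def stretch(signal, factor):
    """Resample to `factor` times the length using linear interpolation."""
    out = []
    for i in range(int(len(signal) * factor)):
        pos = i / factor
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out

def invert(signal):      # flip polarity
    return [-s for s in signal]

def reverse(signal):     # play backwards
    return signal[::-1]

# A rough square-ish wave: a fundamental plus its third harmonic, then clipped.
wave = hard_clip(mix(sine(100, 800), sine(300, 800, amplitude=1 / 3)), 0.8)
print(len(wave), len(zero_crossings(wave)) > 0)
```

The real-time level is where a ring buffer comes in: the synthesizer writes blocks of samples like these into it while the audio callback drains it.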
I guess it would depend on how good the data is and which keyboard is being used.
I know that this is possible on Android.
Better yet: do it in C. There's no "dictionary" object type so you have to make it yourself. You'll soon unlearn a whole bunch of fallacies about how those "dictionaries" actually work. After you've spent a good deal of time doing that, you can switch to authentication/authorization, logging, storage, tracing, API management, resource quotas, and a raft of distributed computing issues.
I recommend basing it on Consul; it has a better general model than etcd.
I love, most of all, how modular the project is. I can do an hour here or there and make meaningful progress.
I'm really eager to discover other very large programming projects that break down into sensible bites so well.
Your comment prompted me to go look up some other attempts, and I’m really glad I did. It seems much more approachable to me now. Thanks!
(the flip side of 80-20 is: "all systems are fault tolerant, it's just that in most of them, the human is the component which tolerates the faults")
Of my IT daily drivers, I've done toy:
- web browser / server
- email client
- document formatter
- text editor
- window manager
- 3D / 2D graphic slicers/rasterizers w/ alphabetic fonts
- shell
- interpreters / compilers
- operating system
- VHDLish CPU
- various data encodings (Hamming, MFM, etc.)
- discrete transistor logic
(when I was just starting to program, I discovered the home directory of a colleague of my father's contained many experiments of this kind, and reading his work taught me C)
[0] https://alessandrocuzzocrea.com/how-i-made-a-ray-tracer/
It's much easier than it may seem, architecting it is interesting, and there is a lot of "last 10%" stuff which keeps it fun as long as you keep going.
In the demystifying area, it demystified HTML and JS history for me, forced me to work with a minimal toolkit, and taught me how to build "modern" JS features in ways which will not break browsers which don't know how to do them or have them disabled.
Make a small 2D platform game, and it covers so many areas (and it is a lot of fun!).
Like, you probably won't make a lot of money making another 2D platformer no matter how well you code; they are so easy to make that there are millions of them out there already. However, if you make a performant and bug-free Factorio or Minecraft clone you will at least get a few thousand people to try it, and from there it would grow if it is fun.
But I can tell you there is no shortage of oversellers in the programming community.
A programmer who's really a sleazy salesman at heart would jump at figuring out a FizzBuzz solution on their own for the first time in their life, and would proceed to mark it as the biggest achievement in the history of mankind, make a webservice out of it, shout at the top of their lungs on social media, and would start knocking on VC doors. (hint hint: many of them actually get away with something not too different).
It starts with basic things like waypoint systems vs. area awareness systems plus the relevant routing algorithms like A*, but goes on to organizing a group of players and finding good strategies. And all of that with a limited time budget and a changing environment around you. Last but not least, you want to emulate human behavior, which is probably the hardest part as it includes changing your behavior according to your situation (don't run straight against a wall for 10 seconds) but also taking into account the weaknesses, as e.g. humans can't aim perfectly.
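As a starting point for the routing part, here is a small A* sketch over a tile grid. The grid layout, the `#` wall convention, and the 4-connected movement are assumptions made for the example; a game bot would run the same search over its waypoint graph or navigation areas instead.

```python
import heapq

def a_star(grid, start, goal):
    """Shortest 4-connected path on a grid of strings; '#' cells are walls."""
    def h(cell):
        # Manhattan distance: never overestimates on a 4-connected grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]        # (estimated total, cost so far, cell)
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:       # walk back to the start
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue                         # stale queue entry
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and grid[nr][nc] != "#":
                new_cost = g + 1
                if new_cost < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc)))
    return None                              # goal unreachable

GRID = [
    "..#.",
    "..#.",
    "....",
]
path = a_star(GRID, (0, 0), (0, 3))
print(len(path) - 1)  # prints 7 (number of steps around the wall)
```

The "limited time budget" in the comment usually means capping how many nodes a search like this may expand per frame and resuming it on the next one.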
Granted, what I have done has a huge field of challenges, but even with a 2D engine I think you can learn a lot from the experience.
I put a bunch of links and quotes about that here, including nascent implementations:
http://www.oilshell.org/blog/2020/07/ideas-questions.html
Also related: http://www.oilshell.org/blog/2020/07/eggex-theory.html
About Unicode, this derivatives project (with video linked in the post) appears to be motivated by Unicode support (though I don't recall exactly why, something about derivatives makes it easier?).
https://github.com/MichaelPaddon/epsilon
https://github.com/MichaelPaddon/epsilon/blob/master/epsilon...
If anyone wants to write a glob engine for https://www.oilshell.org/ let me know :) Right now we use libc but there are a couple reasons why we might have our own (globstar and extended globs)
Trivia: extended globs in bash give globs the power of regular expressions, e.g.
[[ abcXabcXXabcabc == +(abc|X) ]] ; echo $?
0
where +(abc|X) is equivalent to (abc|X)+ in "normal" regex syntax, and == is very confusingly the fnmatch() operator.
The derivatives approach makes Unicode support easier since it's able to keep the symbol sets for each transition edge (in the DFA) more compact by virtue of supporting negation. If you add in aggressive term-normalization, hash-consing, and an efficient dense-set implementation (all of which I’ve done in my implementation), the derivatives approach can be extremely fast, even when generating the DFA for something like the lexer of a full programming language (in my case, F#).
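For readers unfamiliar with the derivatives approach, here is a compact matcher in the Brzozowski style. It is a generic textbook-style sketch, not the commenter's F# implementation: the class names and smart constructors are invented for the example, it matches by taking derivatives on the fly instead of building a DFA, and it works on single characters rather than compact symbol sets.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty: pass          # matches nothing
@dataclass(frozen=True)
class Eps: pass            # matches only the empty string
@dataclass(frozen=True)
class Chr:
    c: str
@dataclass(frozen=True)
class Alt:
    a: object
    b: object
@dataclass(frozen=True)
class Seq:
    a: object
    b: object
@dataclass(frozen=True)
class Star:
    r: object
@dataclass(frozen=True)
class Not:                 # negation is just another node
    r: object

# "Smart constructors" keep terms in a simple normal form.
def alt(a, b):
    if isinstance(a, Empty): return b
    if isinstance(b, Empty): return a
    return a if a == b else Alt(a, b)

def seq(a, b):
    if isinstance(a, Empty) or isinstance(b, Empty): return Empty()
    if isinstance(a, Eps): return b
    if isinstance(b, Eps): return a
    return Seq(a, b)

def nullable(r):
    """Does r match the empty string?"""
    if isinstance(r, (Eps, Star)): return True
    if isinstance(r, (Empty, Chr)): return False
    if isinstance(r, Alt): return nullable(r.a) or nullable(r.b)
    if isinstance(r, Seq): return nullable(r.a) and nullable(r.b)
    if isinstance(r, Not): return not nullable(r.r)
    raise TypeError(r)

def deriv(r, c):
    """The regex matching whatever r can match after consuming c."""
    if isinstance(r, (Empty, Eps)): return Empty()
    if isinstance(r, Chr): return Eps() if r.c == c else Empty()
    if isinstance(r, Alt): return alt(deriv(r.a, c), deriv(r.b, c))
    if isinstance(r, Seq):
        first = seq(deriv(r.a, c), r.b)
        return alt(first, deriv(r.b, c)) if nullable(r.a) else first
    if isinstance(r, Star): return seq(deriv(r.r, c), r)
    if isinstance(r, Not): return Not(deriv(r.r, c))
    raise TypeError(r)

def matches(r, text):
    for c in text:
        r = deriv(r, c)
    return nullable(r)

# (abc|X)+ from the extended-glob example, written as body body*
body = alt(seq(Chr("a"), seq(Chr("b"), Chr("c"))), Chr("X"))
plus = seq(body, Star(body))
print(matches(plus, "abcXabcXXabcabc"), matches(Not(plus), "abd"))  # prints "True True"
```

A DFA generator in this style treats each distinct derivative as a state, which is why normalizing terms so that equal languages produce equal terms matters so much for keeping the state count down.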
* searching
* implementation of automata in electronic circuits
* challenges of formal specifications for things like protocols and grammars, as well as for verifying their correctness; implementation strategies for applying these specifications
* computability and complexity
* programming language theory
* history of computer science
* LANGSEC arguments
in addition to having an austere mathematical beauty.
I disagree. Not everything is about business and money. Many people already build "real shit" for a living and want to simply have fun building other things, and focus on the cool parts, and not all the boring parts involved in a commercial project.
Also CS is constantly evolving. Nobody knows the "fundamentals" once and for all. A ray tracer is still a ray tracer, but languages and technologies have changed immensely in just a few years. Git didn't exist 15 years ago. A language like Rust is 10 years old. React is 7 years old. We need these homework problems simply to keep up to date.
None of them were innovative (as in something new) nor belong to CS fundamentals.
I'm having fun, and that's reason enough.
Fast forward a few years, I am taking CS219 with Prof Stark (random that I remember the course number) who is hard-core and really tough since it's year 2000 and the class is full of kids who are taking CS cuz it's "the thing" but have no passion or talent for programming.
Me, I love programming but I don't have it very much together attendance-wise so I accidentally miss the midterm. OOPS. And obviously there's a "zero make-up test" policy, but surprisingly the prof lets me do the part of it which is a take-home coding assignment, since you can't really benefit from prior knowledge of the questions.
My lucky stars - the test is to make a rudimentary subset of - you guessed it - Tetris. Which I had "solved" for myself a year or two earlier. Apparently I was the only one in the class to nail the implementation.
2. If a course is super hard, maybe the class isn't 'full' of people who are just doing it to 'be cool'; maybe that is your judgement and does not reflect reality.
3. Great that you solved Tetris beforehand, but is there a point here? Are you implying that high school you was smarter than university peers?
Sorry, but your post seems a little elitist, even though it's just an anecdote.
And it brings back a fun memory:
I wrote MacTetris[1] when I was in high school. This made the computer lab a whole lot more popular during free periods than it had been previously.
Two interesting bugs that I recall:
* Mathematically rotating pieces around an axis was a terrible idea, but it produced some entertaining artifacts (and made placement much harder!). I replaced the math by precomputed rotation maps for each piece, which was much better. My first pass at the maps introduced a displacement bug, so you could spin the pieces counterclockwise and they would walk in the negative X direction.
* I got an angry bug report in the lunch room from a kid who had no reason to know my name. He was having a really great game, and then his score started decreasing with every piece. He felt like his record high score had been stolen from him, and he was upset! I asked "what was your score??". He said "I don't know, but by the time I noticed, it was over 30,000 but it was going DOWN!". Aha..[2] :)
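The precomputed rotation maps from the first bug can be sketched as a small lookup table. The 3x3 layout and names below are assumptions for illustration, not the original MacTetris code; the point is that every orientation lives in the same bounding box, so rotating back and forth cannot make a piece drift.

```python
def rotate_cw(cells):
    """Rotate a square grid of 0/1 cells a quarter turn clockwise."""
    return [list(row) for row in zip(*cells[::-1])]

T_PIECE = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
]

def build_rotations(cells):
    """Precompute all four orientations once, at startup."""
    table = [cells]
    for _ in range(3):
        table.append(rotate_cw(table[-1]))
    return table

ROTATIONS = build_rotations(T_PIECE)

def turn(orientation, quarter_turns):
    """New orientation index; negative values turn counterclockwise."""
    return (orientation + quarter_turns) % 4

for row in ROTATIONS[turn(0, 1)]:
    print(row)
```

During play the game only stores an orientation index plus the position of the box's corner, and a rotation is just `turn(orientation, ±1)` followed by a collision check.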
[1] I'm sure the statute of limitations has expired on my appropriation of copyrights and trademarks.
[2] Back in the day, "int" meant "signed 16-bit integer", which is not the proper data type for a score counter.
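Footnote [2] is easy to demonstrate. Python integers do not overflow, so the sketch below emulates a signed 16-bit counter by hand (the helper name is made up); it shows where such a counter wraps, without claiming to reproduce the exact behaviour of the original game.

```python
def to_int16(value):
    """Wrap an integer the way a signed 16-bit two's-complement value would."""
    return (value + 0x8000) % 0x10000 - 0x8000

score = 32_700
score = to_int16(score + 100)   # crosses the 32,767 limit
print(score)                    # prints -32736
```

A 32-bit or unsigned counter pushes the problem far enough away that no lunch-room bug report is likely.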
Damn, that was good.
Having to learn everything needed to write this... it was unbelievably educational from a software design point of view
If you enjoy horror or third person fantasy adventure games, give it a shot. If you do, please fill out the survey at the end so I can make it better for the next release
In fact, this summer I wrote a 3-part tutorial series on implementing a BASIC compiler: Let's make a Teeny Tiny compiler (https://web.eecs.utk.edu/~azh/blog/teenytinycompiler1.html)
My first program ran on a CHIP-8 machine (COSMAC VIP), though I didn’t realize I was targeting an interpreter and not machine code.
Great series of articles!
Parsing (should be) easy, the backend is hard but well documented and trodden, but the semantic analysis and error handling is where the real murky water is (especially when you start trying to optimize it, like adding caching or threading or deferred execution).
You write a compiler for a Java-like language in several steps: a parser that organizes the raw code, then a compiler that emits the virtual machine code, then a translator between the virtual machine code and assembly (and then, between assembly and binary).
I've considered reworking it into a compiler, but never quite gotten around to it. Perhaps a challenge for the new year.
Plus there’s no good way within the context of the puzzles to find out what mathematical trick you need if you don’t already know; you need to go find a virtual water cooler.
I may simply be biased because each year it reveals how little I know, but I much prefer interesting programming problems that don’t require me to go to Reddit, read other people’s description of what math is required, go learn the math involved, and then finally implement a solution.
Most of the problems are around searching a space for some solution or just simulating some state changes. Recent problems involved, e.g., implementing a higher dimensional version of Conway's game of life[0], a simple arithmetic expression evaluator[1], or a simulator for a simple number game[2].
The most recent one[3] involves solving a jigsaw puzzle by using a simple backtracking search (or any number of other methods). It's a bit complex, but not reliant on a particular math trick.
The vast majority of the problems in advent of code shouldn't require any math tricks, though they're often complex and involved, particularly as the month goes on.
[0] https://adventofcode.com/2020/day/17
[1] https://adventofcode.com/2020/day/18
For 2015, and 2020 so far, it's mostly text parsing, data structure building and basic iteration/permutations.
I stopped doing advent this year when I realized I was spending more time debugging my parsing than I was actually solving the problems. It's just not that fun for me to sometimes spend 2+ hours finding parse bugs before moving on to the actual puzzle.
They are almost all algorithm based rather than math based.
The book guides the reader in implementing a graphical web browser, starting with HTTP and HTML then moving on to the layout, the box model, CSS, browser chrome, forms, and scripts.
[0] https://browser.engineering
All my projects start with me thinking like that, then many hours, days or months later me thinking "hey it was more complex than I thought".
For 2021 I want to build a personal finance app for myself. The usual me thinks it will take a couple months. The realist me wonders if it will be finished in this decade :)
test <1 becomes test 1
Test< 2 becomes test 2
Test <a becomes test
Test < b becomes test b
(From memory)
What about: Test <fakeTag>?
Per tests I did, "test " was expected; however, "test <fakeTag>" was seen as the plaintext version, suggesting there's a list of valid tags which is filtering the behavior.
The full details are in here somewhere: https://www.w3.org/TR/2011/WD-html5-20110113/tokenization.ht...
It is always working on all the HTML files I have, but then people make new HTML files with other issues.
Why would you want to learn C? To better understand the machine at a fairly low level. I think there’s still a lot of value in that. I’ve found that programmers who never learned C often don’t fully understand how memory management works, for example (not that that necessarily makes them bad programmers!)
Most other non-garbage-collected languages would do the trick, like Rust or C++. But C arguably still has special value in that it’s a lot simpler than either of those -- no higher-level constructs or abstractions to distract you. Maybe Zig will be able to take over that role.
I don't think it stored the packet at all, just advanced a state machine as each word arrived from the MAC.
https://www.edx.org/course/automata-theory
Definitely recommended if you like somewhat dry and mathematical stuff with deep relevance to many areas of computer science. :-)
What are you doing now?
It is so latency-sensitive that new “hollow” fiber-optic cables are being installed, because light moves faster in air than in glass.
From a letter from the CME to the CFTC dated July 24, 2020:
> On July 26, 2020, an enhancement to the Market Segment Gateway (“MSGW”) will be introduced to further safeguard the CME Globex electronic trading platform (“CME Globex”) infrastructure by introducing a delay of at least three microseconds if the MSGW receives a partial order message as a means of ensuring the stability of the platform. Certain participants intentionally submit partial order messages to reduce latency, and only complete the order message upon the happening of an event or trading signal. Implementation of the enhancement is expected to reduce the frequency of intentionally split order messages as the additional processing time will serve as a deterrent.
The letter is on the web, but it's a PDF; the Google search result has a redirect link to it, DuckDuckGo can't seem to find it, and Firefox on Android won't tell me the URL it downloads things from, so I'm afraid I can't link to it!
CME also made a more general change, where if they decide a participant is sending dodgy messages, they will reroute all their packets to a special gateway for "additional checks", but in practice, to impose a latency penalty. Can't find the documentation on that at all, though.
If you can afford to spare part of your income and invest that for long term buy and hold, that'll work better (e.g. ETFs).
If you can't spare anything, focus on leveling up trendy skills so that you can land a better job in the short term.
However if you just want to trade for the fun of it, yes crypto can be simpler than stocks, mainly because the APIs are better. Set up a maximum budget as a safety net and enjoy :)
Perl Advent Calendar is my jam.
However your #2 is off-base. There was something like 400% the applicants to the CS program in my university in 1999 vs 1998 and I bet that was true across the board. It was because dot-com was the hot shit and CS became a lot of people's default. The CS department had a tough choice between lowering the bar and "turning away business." This is not a controversial thing.
What happens if you don't do the optimizations? Does the DFA blow up in size, meaning the compile time is large? Or does it make for a slower runtime? I would expect most DFAs to run at about the same speed, unless they are really huge...
I'd be interested in any rough ideas about performance, e.g. how fast a realistic lexer+parser is, maybe in lines/ms.
It does look like the code is pretty short -- a large part of it is an AVL tree library I guess for hash consing?
I'm interested in any downsides of the derivatives technique vs. the NFA->DFA method. I feel like regex compile time shouldn't matter for many applications, and most DFAs will run in the same speed, which only leaves runtime memory usage (or code size for generating F# code like you appear to be doing).
* Normalization: this is where "smart constructors" come in handy; having a normal form for the terms allows the caching to work better. This also impacts the compactness of the generated DFA.
* Hash-consing: this turns structural equality (in this case) to a simple pointer equality; applied recursively, this makes it much faster to compare two terms for equality, and overall speeds up the DFA generation by a non-trivial amount (I forget the exact numbers, but it was significant).
* Dense set implementation: The AVL tree-based data structure in the facio/Reggie code is an implementation of the Discrete Interval Encoding Tree (DIET) data structure from "Diets for fat sets" and "More on Balanced Diets" papers.
Note the optimizations I've mentioned here impact the performance of generating the DFA. Once you have the DFA, it'll run at the same speed as one generated in any other way. Part of the motivation for my writing this library was to learn about regex/DFAs/grammars, but also to try to improve on the performance of fslex/fsyacc at the time. Using this library, the FSharpLex tool can generate the DFA for the full F# language grammar in well under 1 sec; the code generation takes a bit longer, largely due to having to convert the DFA into a different form for backwards-compatibility with fslex.
Overall, I feel like the derivatives technique is generally better and simpler, and I'm not aware of any real downsides. The only one that comes to mind is if you're wanting to implement things like backreferences and capture groups -- those obviously make the implementation (of the DFA) more complicated, and there's a lot less literature on it (last I saw, maybe only one or two papers on implementing those features on top of a derivatives-based regex engine).
Although it does seem more suited for functional languages for sure, whereas I basically only have a C runtime.
Capturing might be an issue. I found this
https://www.home.hs-karlsruhe.de/~suma0002/publications/posi...
but I think it's actually being a bit pedantic, i.e. if "almost all POSIX implementations are buggy" then applications don't rely on those exact semantics (they probably rely on the buggy ones, if anything ...)
Maybe more relevant: http://www.home.hs-karlsruhe.de/~suma0002/publications/ppdp1...
If I am working on a challenging project recreationally to stretch the limits of my comfort level, say a compiler or a distributed system, the last thing I want is the cognitive overload of trying to deal with C, unless I am a C expert already. When working on such projects you should be focused on grokking the problem being solved as opposed to worrying about language idiosyncrasies.
If you want to learn C, it should be done in an environment where it is decoupled from trying to learn another extremely challenging idea. So I definitely would not recommend that someone interested in learning about distributed systems attempt to build one in C. If you are trying to learn C, you should probably build something that you are quite familiar with, so that your focus is on learning the language and its concepts as opposed to trying to implement/understand the ideas behind Paxos while also trying to learn safe memory management and pointer tricks.
Also, I learned a lot more about C (and computing in general) by working on complicated projects rather than simple ones. I had built all sorts of simple C projects, but one complicated one taught me twice as much as I'd previously learned. It took me a long time, but it was all valuable because I got to learn what designs didn't work for C and which did.
Anyway, it's up to the reader to decide how much of their time to invest and what they want to get out of it. If you don't want to write it in C, don't, but I personally think the incidental lessons are some of the most valuable.
Although, if you're mainly interested in getting something done rather than using it as a learning experience, using an unfamiliar language is probably a bad idea. You'll learn a lot about a new language, but you probably won't build anything you could actually ship.
Only as long as you avoid any fancy libraries that do a lot of the work for you.
Granted you can use fancy libraries in C too...
What you are describing is the most basic, canonical form of latency arbitrage.
That idea is implemented widely in the HFT industry across all sorts of products, not just crypto, with billions spent on trading on the information faster.
(Source: this is my job).
Not being rude or judgemental, just want to know how you feel about this.
I think your question makes some incorrect factual assumptions, but also it incorporates a world view that I don't subscribe to (not wrong, but also not what I believe).
I think that if I had your belief system and your set of facts, I would probably feel uncomfortable about it.
Crypto is one level more complicated, because you don't share inventory across exchanges and transaction costs are high, which means a coin on Exchange1 isn't perfectly fungible with a coin on Exchange2.
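As a rough illustration of why costs matter here, a back-of-the-envelope sketch in Python. The fee model (a proportional taker fee on each leg plus a flat cost to move inventory back) and all the numbers are assumptions for illustration, not real exchange terms.

```python
def net_edge(buy_price, sell_price, qty, taker_fee, transfer_cost):
    """Profit from buying qty on one exchange and selling it on another.

    taker_fee is a fraction of notional charged on each leg (hypothetical);
    transfer_cost is a flat amount for rebalancing inventory (hypothetical).
    """
    gross = (sell_price - buy_price) * qty
    fees = taker_fee * qty * (buy_price + sell_price)
    return gross - fees - transfer_cost

# A 0.5 price gap survives the costs, a 0.2 gap does not.
wide = net_edge(100.0, 100.5, qty=10, taker_fee=0.001, transfer_cost=2.0)
narrow = net_edge(100.0, 100.2, qty=10, taker_fee=0.001, transfer_cost=2.0)
print(wide > 0, narrow > 0)
```

With these made-up numbers a visible price gap between the two exchanges is not automatically a tradable one, which is the sense in which the two coins are not perfectly fungible.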
One level more indirect than that might be basis trades, where you trade a derivative (like SP500 futures) vs. its underlying (the SP500 stocks, although in practice it's SP500 ETFs). So here the correlation is very high, but there are fundamental differences between futures, stocks, and ETFs, and those play into the pricing.
Going even further might be trading correlated products that don't have the same underlying, for example Nasdaq futures vs. SP500 futures.
To simplify: It's basically about the level of correlation between the products. The strategies used to trade different correlations look qualitatively different.
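One simple way to put a number on "level of correlation" is the Pearson correlation of the two products' returns. A minimal Python sketch follows; the price series are invented, and the helper names `returns` and `pearson` are just for illustration.

```python
from math import sqrt

def returns(prices):
    """Simple period-over-period returns."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Invented prices: an "ETF" that is the future scaled by 0.1, and a
# loosely related second product.
future = [100.0, 101.0, 100.5, 102.0, 101.0]
etf = [p * 0.1 for p in future]
other = [50.0, 50.2, 50.4, 50.3, 50.1]

same_underlying = pearson(returns(future), returns(etf))
different_underlying = pearson(returns(future), returns(other))
print(same_underlying, different_underlying)
```

Real data is noisier than this, and correlations measured at very short horizons can look quite different from daily ones, so treat this as a starting point only.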
Another thing to check is whether the data was recorded with timestamps taken where the trading happened, but you probably thought of that one.
Idea makes sense though.
Why do they have to be profitable? Many programmers like solving challenges to learn and have fun; it's not all about money. Advent of Code is a great example of this.
If you could choose between doing a contrived homework problem and learning something vs. creating something original and learning something, you should always choose the latter, hands down. Learning on the job or while you are solving a unique problem that you have is always preferred to doing a homework problem.
Why you would take offense at that, or why you would assume that everyone is trying to build a company, is beyond me.
In any case, this list is literally for programmers. It is not for founders. It is not meant to earn you money. It is a list of small enough challenging projects that force you to learn new technical skills.
As for "my" set of facts... well they're just facts :) not "mine" or "yours".
My point is that I'm curious about how the people who work on these things feel about their job. Do they think they're doing the right thing? Do they "own it" to themselves that they don't care as long as they make money? Do they not think about it too much?
In return, HFT provides liquidity (and thereby smaller spreads) to the market, I would think.
If you ever bought or sold a stock with a market order, you profited from the work HFT is doing.
Isn't Robinhood able to offer free trading to customers solely because they sell their order flow on to, among others, HFT firms?[0]
[0] https://www.cnbc.com/2020/08/13/how-robinhood-makes-money-on...
His job facilitates trades between people who want to trade with each other. Since they are part of society, our society benefits from his work.
Evidently it's doing more than "make some already fabulously rich people even richer", because it's paying salaries for some programmers, as well.