
  • 3 Posts
  • 61 Comments
Joined 2Y ago
Cake day: Jun 13, 2023


If an instance goes down permanently, federation of all of the communities hosted by that instance essentially stops. The content that has already been posted remains, but anything new added to those communities only remains on your home instance. The only way for federation to resume is for that instance to come back online with the same domain it started with.


Same. Honestly I need to create a community just for this tool IMO. But I don’t have the time to moderate it.


https://www.search-lemmy.com/ ?

I’m open to feedback though if the search results seem out of order etc…


Unless you have an account there’s no easy way to get access to the content on the page. Once you have an account there’s technically nothing stopping you from just saving the HTML file to your computer.

Something else you can try though, assuming you don’t have an account, is to just turn off JavaScript. If the site lets you partially load the content and then asks you to create an account to read more, they usually just block the content by having JavaScript add an opaque overlay. With JavaScript disabled, obviously it’s not there to add the overlay and you’re able to keep reading.


That looks like 8.8.8.8 actually responded. The ::1 is IPv6’s localhost, which seems odd. As for the wrong IPv4 address, I’m not sure.

I normally see something like requested 8.8.8.8 but 1.2.3.4 responded if the router was forcing traffic to their DNS servers.

You can also specify the DNS server to use when using nslookup like: nslookup www.google.com 1.1.1.1. Then you can see if you get any different answers from there. But what you posted doesn’t seem out of the ordinary other than the ::1.

Edit: just for shits and giggles, also try nslookup xx.xx.xx.xx where xx.xx… is the wrong IP from the other side of the world and see what domain it returns.


Another thing that can be happening is that the router or firewall is redirecting all port 53 traffic to their internal DNS servers. (I do the same thing at home to prevent certain devices from ignoring my router’s DNS settings cough Android cough)

One way you can check for this is to run “nslookup some.domain” from a terminal and see where the response comes from.
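
For anyone who wants to script that check, here is a rough Python sketch of the same idea: send a DNS query to a specific server and report which address the reply came from. The function names are made up for illustration, and it only handles a plain A-record query over UDP. Depending on how the router implements the redirect, the reply may still appear to come from the address you queried, so treat a match as inconclusive.

```python
import socket
import struct


def build_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet for an A record (RFC 1035 layout)."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: length-prefixed labels terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.strip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (Internet).
    return header + qname + struct.pack("!HH", 1, 1)


def responder_for(domain: str, server: str, timeout: float = 3.0) -> str:
    """Send the query to `server` and return the IP that actually answered."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(domain), (server, 53))
        _data, (addr, _port) = sock.recvfrom(512)
        return addr
```

Calling `responder_for("www.google.com", "8.8.8.8")` and getting back something other than `8.8.8.8` would point at the kind of redirect described above.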


It searches the title and body. It also automatically searches for similar words, like ‘bike’, ‘biking’, ‘bikes’ (aka stemming). Granted though, I’m still improving the page ranking as time goes on.
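
To illustrate what stemming means here, a deliberately naive Python sketch follows. It is not the stemmer the site uses (a real search engine would rely on something far more careful); the suffix list and the `naive_stem` name are assumptions picked only so the three example words collapse to the same stem.

```python
# Suffixes are checked in order, longest first, so "bikes" loses "es" not just "s".
SUFFIXES = ("ing", "es", "s", "e")


def naive_stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query_word: str, document_word: str) -> bool:
    """Two words match if they reduce to the same stem."""
    return naive_stem(query_word) == naive_stem(document_word)
```

With this, a query for `bike` would also match posts containing `biking` or `bikes`, because all three reduce to the same stem.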


There is a public API now. While I won’t support sorting, you can process and do what you will with the results as-is. For now I only support Posts and Communities.

When you search for posts you’re just matching against the title or body. For communities it’s searching the posts within that community.

There are also more filters now: instance/community/author/since/until and a safe-search option.

So I’m not sure how close this comes to your idea but I thought I’d share.



Yep. The idea is that the instance you select is the instance that all links will open up in. This way once you find a post you’re looking for you don’t have to find it on your home instance and can immediately start replying, save it for later, subscribe to the community, etc…

Edit: I do plan on “fixing” this eventually, but I’m waiting on a bugfix on lemmy itself. You can see more here: https://github.com/marsara9/lemmy-search/issues/20


Just pushed a new update, that should fix an issue with the home-instance selector. Since you’re on lemmy.world I’d suggest selecting that instance from the drop-down and trying your search again…


Bummer. That’ll slow down Kbin’s inclusion into this. Well once it is available I’m sure I’ll start digging into it. But thanks for the info.


I’ve already started to abstract away Lemmy from the search engine itself. So the first steps are in place. Once I get the kinks of the 0.4.x release knocked out then I plan on reading up on Kbin’s API and I’ll start working on the crawler. I can’t promise anything but that should give you a rough timeline.

If you have any programming skills I could always use a hand.


Check out my post history.

But https://www.search-lemmy.com. It has a few bugs but it should work for you. Especially if you set your home instance to something large like Lemmy.world.

Edit: if you want to help contribute: https://www.github.com/marsara9/lemmy-search


Lol. I guess I better have this working in 4 weeks. /jk


It’s on my to-do list. Sadly though, for something to show in the drop-down for home instances, I must have previously crawled that site, because my #1 requirement is that if you click a link it must open in your home instance. The good news is that Kbin and Lemmy work nearly identically, so Kbin will be the first non-Lemmy type of instance that you can search.


Thanks. It shouldn’t be long before I start considering the APIs stable enough that I can maybe reach out to the 3rd party app devs and then maybe who knows…


Ya I index both post titles and the post body. I also weigh the body content slightly higher as well. So posts that just link an article will usually show lower than posts that actually have content.

At some point though this will change. As eventually I’ll start adding comment data to the index as well. But I’m waiting on a bug in Lemmy itself to be fixed before I begin working on that.
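
As a rough illustration of that weighting (not the site's actual ranking code), the idea can be sketched in Python like this. The function name and the weight values are hypothetical; the only point is that a body match counts for more than a title match.

```python
def score_post(query_terms, title, body, title_weight=1.0, body_weight=1.5):
    """Count term occurrences, weighting body matches above title matches."""
    title_words = title.lower().split()
    body_words = body.lower().split()
    score = 0.0
    for term in query_terms:
        term = term.lower()
        score += title_weight * title_words.count(term)
        score += body_weight * body_words.count(term)
    return score
```

A link-only post with an empty body can only score on its title, so a post with the same title plus matching body text ranks above it.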


Make sure you select your home instance from the drop-down.

I’ve got an outstanding bug where first time users are defaulting to an obscure / small instance that doesn’t have much content.

https://github.com/marsara9/lemmy-search/issues/45



Still working for me. You might need to clear your browser cache as I did make some UI changes as part of the update.


Eventually it will. But there’s a bug preventing global search working in lemmy itself. You can see more of the details here (https://github.com/marsara9/lemmy-search/issues/20 )

One of my primary goals with this is that users MUST be able to open a given link in their home instance so that they can then interact / reply / subscribe, etc… without having to figure out how to find said post themselves. So with that requirement, users MUST select a home instance, but because of the aforementioned bug I can’t show posts that your instance isn’t aware of.
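
For community links, the home-instance idea can be sketched in a few lines of Python. This assumes the Lemmy web UI convention of addressing a remote community as `/c/name@host` on your own instance; the function name is made up for illustration. Posts are harder because each instance assigns its own local post IDs, which is why the home instance has to know about the post first.

```python
def community_link(home_instance: str, community: str, host_instance: str) -> str:
    """Build a community URL that opens on the user's home instance."""
    if host_instance == home_instance:
        # Local communities do not need the @host suffix.
        return f"https://{home_instance}/c/{community}"
    return f"https://{home_instance}/c/{community}@{host_instance}"
```

So a user whose home instance is lemmy.world would get `https://lemmy.world/c/linux@lemmy.ml` for a community hosted on lemmy.ml.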


A new way to search for communities
I keep seeing people complain about not being able to find active communities that match their interests. So I've added a new feature to https://www.search-lemmy.com/ that allows you to search posts for a particular topic and then tells you which communities have the most posts matching your search query. And assuming that you've set your home instance correctly, those links will even open up in your home instance, so that you can subscribe directly to them.

For example, if you search for 'linux' (https://www.search-lemmy.com/find-communities/results?query=linux&page=1) it gives you a link to each community, tells you which instance it's on and how many matches it found for your query.

All of the same filters that you can use on the normal search can be used here as well. So if you just want to find the best community that mentions linux on lemmy.world (https://www.search-lemmy.com/find-communities/results?query=linux+instance%3Alemmy.world&page=1), you can filter by just that instance. Click on the `Search Tips` button to see a list of all of the available filters.

P.S. I'm aware of https://lemmyverse.net/ etc... and while those are great as well, this allows you to search to see what people are actually talking about on the various communities.

Again, if you have any feature requests or find any bugs, PLEASE reach out or ideally go to my github (https://github.com/marsara9/lemmy-search) and log a bug there.

Here’s the landing page if you just go to https://www.search-lemmy.com/. I’m assuming that drop-down that you’re referring to is your home instance selector. Since you’re on lemmy.world I suggest you set that to well, lemmy.world. Then you can do your search and all of the results will take you directly to that post on lemmy.world (or whatever you set as your home instance).

Now you can also see that Find Communities button in the top right, you can click on it and it’ll take you to a similar page but instead of returning posts for search results, it will return a list of communities, based on how many matches it found. (as if you did a search on the normal page but instead just counted the number of results per community).
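
The counting step described above can be sketched in Python as follows. This is only an illustration of the idea, not the site's code; the `community` key on each post and the function name are assumptions.

```python
from collections import Counter


def rank_communities(matching_posts):
    """Count search hits per community and return them most-matched first."""
    counts = Counter(post["community"] for post in matching_posts)
    return counts.most_common()
```

Feeding it the posts returned by a normal search yields a list of `(community, match_count)` pairs, highest count first.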


https://www.search-lemmy.com/find-communities/results?query=camping&page=1

It’s under the “Find Communities” button at the top of the screen. If you don’t see that button, try clearing your browser cache.


Just added a new feature that lets you search but it returns the number of matches per community. So you should be able to use that to find the most active communities based on your search result.



Thanks. Adding an issue for that. I should be able to set the default instance to the ‘seed-instance’ that’s configured for the crawler.


Look under search tips for the filters at least.



some query community:[email protected] would search your community. Now, if this community is less than 24hrs old, it may not have been indexed yet, so you may just need to wait a day or so.

I just tested with https://www.search-lemmy.com/results?query=test+community%3A!ultralight%40lemmy.world&page=1 I was able to at least find 2 posts.
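
If you want to build links like that one from a script, a small Python sketch is below. It only reproduces the results-page URL shape shown above (`query` and `page` parameters, filters written inline as `key:value`); the function name is made up, and it does not cover the public API.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.search-lemmy.com/results"


def build_search_url(terms: str, page: int = 1, **filters: str) -> str:
    """Append key:value filters to the search terms and URL-encode the result."""
    query = " ".join([terms] + [f"{key}:{value}" for key, value in filters.items()])
    # Keep "!" unescaped so the output matches the links the site itself produces.
    params = urlencode({"query": query, "page": page}, safe="!")
    return f"{BASE_URL}?{params}"
```

For example, `build_search_url("test", community="!ultralight@lemmy.world")` produces the same URL as the test link above.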


I’ve been seeing a few people ask for something like this recently, so I might try and see how hard it would be to build something to help find active communities.


Currently you can just search for posts. I don’t track anything like the number of members in a community etc… just the content of each post and how closely it matches your current query. I’m continuously trying to improve the page rankings though.

I guess in theory you can perform the same search multiple times with different community:!some_community@some_instance filters to see which returns the most results, but ya, that wouldn’t be the most convenient. At the moment this tool though is about finding posts, but who knows what features I may add in the future.


This doesn’t change the behavior of the built-in search within Lemmy. Rather, this is supposed to be a close approximation of using Google and adding reddit to the end of your query.

The problem with the fediverse is that there are so many different instances you can’t really include them all in a search query and even if you could the links that Google would provide wouldn’t necessarily go to YOUR instance. This aims to fix that.


Search Lemmy 0.4.0 update (an enhanced search engine for Lemmy)
A couple days ago I updated https://search-lemmy.com/ to 0.4.0.

New features that several people were asking for:

* The UI has been overhauled and it should be much easier to find your home instance now.
* Search itself has been overhauled. Search performance has increased significantly. I also automatically search for related terms as well. You may now see fewer search results, but ideally they should be more relevant. You can also now include basic syntax like:
  * quotes: "some terms that must be together"
  * negative terms: `cat -dog` (shows posts about cats that don't mention dogs)
  * either or: `cat OR dog` (shows posts about either cats or dogs). The default search behavior is now an implicit AND, but order doesn't matter.
* I've added several new filters that you can use including:
  * `!safeoff` -- Disables safe search allowing NSFW posts to appear in the search results (NSFW is now hidden by default)
  * `since:YYYY-MM-DD` -- shows only posts that have occurred since the specified date
  * `until:YYYY-MM-DD` -- same as above but in reverse. It will only show posts up to the given date.
* I've removed the preferred-instance query parameter from the results URL so it should be easier to share links to search results now.
* The date the post was created or last updated is now displayed in the search results.

Bug Fixes:

* Site performance should now be stable. Fixed a bug related to the database pool that was causing the site to hang.
* Fixed a bug that would cause broken links.
* Fixed various bugs with the crawler causing posts to be missed.

Known Issues:

* If you set your home-instance to a fairly small instance, the number of search results is also relatively small. Once (https://github.com/LemmyNet/lemmy/issues/3259) is resolved, I should be able to show links regardless of what your home instance is set to, allowing you to search the entire Fediverse.
* Currently searching only looks at the post title and body. Comments aren't indexed yet. This also is dependent on the above issue on Lemmy itself.

Finally some things to note: I've started to refactor the code to abstract away Lemmy from the actual search engine, as I now start to prepare to search other Fediverse instances like Kbin, and maybe even Mastodon, etc...
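
To make the query syntax concrete, here is a hedged Python sketch of how a query string could be split into its parts. It is not the parser used by the site; the `ParsedQuery` type and `parse_query` name are hypothetical, and `OR` is not given special treatment (it would simply land in `terms`).

```python
import re
from dataclasses import dataclass, field

FILTER_KEYS = {"instance", "community", "author", "since", "until"}
# Either a double-quoted phrase or a run of non-whitespace characters.
TOKEN_RE = re.compile(r'"([^"]*)"|(\S+)')


@dataclass
class ParsedQuery:
    terms: list = field(default_factory=list)
    phrases: list = field(default_factory=list)
    excluded: list = field(default_factory=list)
    filters: dict = field(default_factory=dict)
    safe_search: bool = True


def parse_query(query: str) -> ParsedQuery:
    """Split a query into plain terms, phrases, negated terms and filters."""
    parsed = ParsedQuery()
    for phrase, word in TOKEN_RE.findall(query):
        if phrase:
            parsed.phrases.append(phrase)
        elif not word:
            continue  # an empty pair of quotes
        elif word == "!safeoff":
            parsed.safe_search = False
        elif word.startswith("-") and len(word) > 1:
            parsed.excluded.append(word[1:])
        else:
            key, sep, value = word.partition(":")
            if sep and key in FILTER_KEYS and value:
                parsed.filters[key] = value
            else:
                parsed.terms.append(word)
    return parsed
```

For instance, `parse_query('cat -dog since:2023-07-01')` separates the plain term `cat`, the excluded term `dog` and the `since` filter.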

  1. Yes most trackers have something on their website to let you know what your ratio is, what you’re downloading and how long you’ve been seeding those files.
  2. With the trackers I’m familiar with yes – seeding for 9d 23h 59m and 59s is the same as seeding for 0s. You’ll still get tagged with a HnR (Hit and Run)
  3. You can shut down as much as you like. But, again, the trackers that I’m familiar with have a cap on the number of HnRs you can have on your account. So you might have action taken against you if you’re seeding 5 different torrents and decide to shut down.
  4. Don’t know.
  5. The rest don’t appear to be questions so not sure how to respond.

Cloudflare? Namecheap?

Not sure exactly what features you’re after but the vast majority of them support what you mentioned above.



Btw I appreciate the fediverse and decentralization as much as the next guy, heck I’m even writing software for the fediverse. But I feel like there’s a handful of people out there that want to try and apply the fediverse concept to everything. Similar to what happened with Blockchain. Everyone and everything had to be implemented via Blockchain even if it didn’t make sense in the end.

IMO though, GitHub is just one “instance” in an already decentralized system. Sure it may be the largest but it’s already incredibly simple for me to move and host my code anywhere else. GitHub’s instance just happens to provide the best set of tools and features available to me.

But back to my original concerns. Let’s assume you have an ActivityPub based git hosting system. For the sake of argument let’s assume that there’s two instances in this federation today. Let’s just call them Hub and Lab…

Say I create an account on Hub and upload my repository there. I then clone it and start working… It gets federated to Lab… But the admin on Lab just decides to push a commit to it directly because reasons… Hub can now do a few things:

  1. They could just de-federate but who knows what will happen to that repo now.
  2. Hub could reject the commit, but now we’re in a similar boat, effectively the repo has been forked and you can’t really reconcile the histories between the two. Anyone on Lab can’t use that repo anymore.
  3. Accept the change. But now I’m stuck with a repo with unauthorized edits.

Similarly if Hub was to go down for whatever reason. Let’s assume we have a system in place that effectively prevents the above scenario from happening… If I didn’t create an account on Lab prior to Hub going down I now no longer have the authorization to make changes to that repository. I’m now forced to fork my own repository and continue my work from the fork. But all of my users may still be looking for updates to the original repository. Telling everyone about the new location becomes a headache.

There’s also issues of how do you handle private repositories? This is something that the fediverse can’t solve. So all repos in the fediverse would HAVE to be public.

And yes, if GitHub went down today, I’d have similar issues, but that’s why you have backups. And git already has a solution for that outside the fediverse. Long story short, the solutions that the fediverse provides aren’t problems that exist for git and it raises additional problems that now have to be solved. Trying to apply the fediverse to git is akin to “a solution in search of a problem”, IMHO.


I don’t get what benefit hosting your own git brings to be honest

Just another level of backup. Personally I tend to have:

  1. A copy of my repo on my dev machine
  2. A copy on a self hosted git server. Currently I’m using gitbucket though.
  3. A copy on GitHub.

This way I should always have 2 copies of my code that are accessible at all times, so there’s a very slim chance that I’ll lose my code, even temporarily.


IMHO federation doesn’t bring any real benefits to git and introduces a lot of risks.

The git protocol, if you will, already allows developers to back up and move their repositories as needed. And the primary concern with source control is having a stable and secure place to host it. GitHub already provides that, free of charge.

Introducing federation, how do you control who can and cannot make changes to your codebase? How do you ensure you maintain access if a server goes down?

So while it’s nice that you can self host and federate git with GitLab, what value does that provide over the status quo? And how do those benefits outweigh the risks outlined above?


Let me introduce you to https://sense.com/ and help you create a new obsession.

P.s. it’s not perfect as it uses machine learning to determine your appliances and it can’t find electronics like your computer or TV but it’ll help you find what might be chipping away at your power bill.


Announcing a new Search Engine for Lemmy
I shared bits and pieces of this before, but it's officially up and running now: https://www.search-lemmy.com/

This is an enhanced search engine for Lemmy, with a few primary goals:

* You can choose a preferred instance. After choosing what your primary instance is and performing a search, ALL links will open in that instance.
* This aims to be a replacement for using `site:reddit.com` in Google, but just for the fediverse.
* You can filter the search results by:
  * Instance -- This will filter the results to only show communities that belong to a particular instance. Just type something like `instance:lemmy.world` or `instance:https://lemmy.world/`. This is separate from your preferred instance, such that you can search for posts on lemmy.world while still opening them on lemmy.ml.
  * Community -- You can refine the search by a specific community. You use the same syntax that you'd use here `community:[email protected]`.
  * Author -- Similar to the above you can also filter by a specific author such as: `author:@[email protected]`.
* The entire thing is open-source. You can view the code and even host your own instance... See more details here: https://github.com/marsara9/lemmy-search.

NOTE: This only supports Lemmy instances for now. Other fediverse type instances may be supported in the future depending on how this works out.

I've been working on this over just the last few weeks, so it hasn't had a chance to crawl much of the fediverse yet. For now it only supports `lemmy.world` and `lemmy.ml` but other preferred-instances will come online as time goes by.

If anyone finds any bugs, and I'm sure you will, or if anyone has any suggestions PLEASE raise an issue on GitHub for me to track.

Lastly, if anyone wants to help contribute please feel free to reach out.

**NOTE TO SERVER ADMINS: You can prevent your site from being crawled by adding `lemmy-search` to your robots.txt for the user-agent.**
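
For server admins who want to sanity-check their rules, here is a small Python sketch using the standard library's `urllib.robotparser`. The robots.txt content below is only an example of blocking the `lemmy-search` user-agent everywhere, and `example.com` and the `is_allowed` helper are placeholders; how the crawler itself evaluates the file is not shown here.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block the lemmy-search crawler from the whole site.
ROBOTS_TXT = """\
User-agent: lemmy-search
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether `user_agent` may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With those rules, `is_allowed(ROBOTS_TXT, "lemmy-search", "https://example.com/post/1")` is false while other user-agents remain unaffected.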