If your text element is in world space, you can find the apple’s position and place the text element slightly closer to the camera from that position. That way it will be in front of the apple, but it can still be blocked by other objects if they are in the way.
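To make the world space idea concrete, here is a minimal sketch of the math, written in plain JavaScript so it runs without the engine. The helper name `offsetTowardCamera` and the plain `{x, y, z}` objects are my own assumptions for illustration, not part of any engine API.

```javascript
// Hypothetical helper: returns a point that sits `distance` units away from
// `target` along the line from `target` to `camera`.
function offsetTowardCamera(target, camera, distance) {
    // Direction from the target (e.g. the apple) to the camera.
    const dx = camera.x - target.x;
    const dy = camera.y - target.y;
    const dz = camera.z - target.z;
    const length = Math.hypot(dx, dy, dz);

    // Camera and target coincide: there is no direction to move along.
    if (length === 0) {
        return { x: target.x, y: target.y, z: target.z };
    }

    // Clamp so the result never ends up behind the camera.
    const step = Math.min(distance, length) / length;

    return {
        x: target.x + dx * step,
        y: target.y + dy * step,
        z: target.z + dz * step
    };
}
```

In a script you would probably feed this with the positions of the apple and camera entities and assign the result to the text entity every frame, since the camera can move.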
If your text element is in 2D screen space, it will always be drawn in front of all objects in world space, but you need to calculate the correct screen position so that it appears over the apple, for example. If you write a custom shader here, just make sure you take the z depth of the buffer into account. I learned this with help from @Leonidas here.
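For the screen space route, the position calculation is a perspective projection. Below is a rough, engine-free sketch under simplifying assumptions: the point is already expressed in camera space (camera at the origin, looking down the negative Z axis), `fovDegrees` is the vertical field of view, and the screen origin is the top-left corner. The function name `cameraToScreen` is hypothetical. As far as I know, the camera component has a `worldToScreen` method that does this for you, so treat this only as an illustration of what happens underneath.

```javascript
// Hypothetical helper: projects a camera-space point to pixel coordinates.
// Returns null when the point is behind the camera and cannot be shown.
function cameraToScreen(point, fovDegrees, width, height) {
    // The camera looks down -Z, so visible points have negative z.
    if (point.z >= 0) {
        return null;
    }

    // Focal length derived from the vertical field of view.
    const f = 1 / Math.tan((fovDegrees * Math.PI / 180) / 2);
    const aspect = width / height;
    const depth = -point.z;

    // Normalized device coordinates in the range [-1, 1] for visible points.
    const ndcX = (point.x * f / aspect) / depth;
    const ndcY = (point.y * f) / depth;

    // Map to pixels; Y is flipped because screen Y grows downward.
    return {
        x: (ndcX + 1) * 0.5 * width,
        y: (1 - ndcY) * 0.5 * height
    };
}
```

A point straight ahead of the camera lands in the center of the screen, which is a quick sanity check when you wire this up.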
And yes, making it darker or transparent when something is in front of it would require a custom shader. If you go with world space, you could use the chunks system instead of writing a completely new shader, but it is not well documented yet.